Detecting and treating a target from a moving platform

ABSTRACT

A method includes receiving, by the treatment system, during operation in an agricultural environment, one or more images comprising one or more agricultural objects in the agricultural environment, identifying, in real-time, one or more objects of interest from the one or more agricultural objects by analyzing the one or more images, wherein the analyzing results in a first object being identified as belonging to one or more target objects and a second object being identified as not belonging to the one or more target objects, logging one or more results of the identification of each of the one or more objects of interest and a corresponding treatment decision; and activating the treatment mechanism to treat the one or more target objects.

CROSS REFERENCE TO RELATED APPLICATIONS

This patent document is a continuation of and claims the benefit of priority to U.S. patent application Ser. No. 17/840,534, filed on Jun. 14, 2022, which is a continuation of and claims the benefit of priority to Ser. No. 17/506,618, filed on Oct. 20, 2021, now U.S. Pat. No. 11,399,531, issued Aug. 2, 2022. The entire contents of the before-mentioned patent applications are incorporated by reference as part of the disclosure of this application.

TECHNICAL FIELD

The present patent document relates to machine learning and robotic implementation of agricultural activities.

BACKGROUND

Global human population growth is expanding at a rate projected to reach 10 billion or more persons within the next 40 years, which, in turn, will concomitantly increase demands on producers of food. To support such population growth, food production, for example on farms and orchards, needs to generate collectively an amount of food that is equivalent to the amount that the entire human race, from the beginning of time, has consumed up to that point in time. Many obstacles and impediments, however, likely need to be overcome or resolved to feed future generations in a sustainable manner.

To support such an increase in demand, agricultural technology has been implemented to more effectively and efficiently grow crops, raise livestock, and cultivate land. Such technology in the past has helped to more effectively and efficiently use labor, use tools and machinery, and reduce the amount of chemicals used on plants and cultivated land.

However, many techniques used currently for producing and harvesting crops are only incremental steps from a previous technique. The amount of land, chemicals, time, labor, and other costs to the industry still pose a challenge. A new and improved system and method of performing agricultural services is needed.

SUMMARY

Techniques for detection of and controlling growth of undesirable vegetation in a field are described.

In one example aspect, a method includes obtaining, by the treatment system mountable on an agricultural vehicle and configured to implement a machine learning (ML) algorithm, one or more images of a region of an agricultural environment near the treatment system, wherein the one or more images are captured from the region of the real world where agricultural target objects are expected to be present, determining, by the treatment system, one or more parameters for use with the ML algorithm, wherein at least one of the one or more parameters is based on one or more ML models related to identification of an agricultural object, determining, by the treatment system, a real-world target in the one or more images using the ML algorithm, wherein the ML algorithm is at least partly implemented using the one or more processors of the treatment system, and applying a treatment to the target by selectively activating the treatment mechanism based on a result of the determining the target.

In another example aspect, a method performed by a treatment system having one or more processors, a storage, and a treatment mechanism is disclosed. The method includes receiving, by the treatment system, sensor inputs including one or more images comprising one or more agricultural objects; continuously performing a pose estimation of the treatment system based on sensor inputs that are time synchronized and fused; identifying the one or more agricultural objects as real-world target objects by analyzing the one or more images; tracking the one or more agricultural objects identified by the analyzing; controlling an orientation of the treatment mechanism according to the pose estimation for targeting the one or more agricultural objects; and activating the treatment mechanism to treat the one or more agricultural objects according to the orientation.

In another example aspect, an apparatus is disclosed. The apparatus may be used as an agricultural vehicle and comprises a processor and one or more sensors that are configured to obtain sensor readings of an agricultural environment, analyze the one or more sensor readings to determine a target, and activate a treatment mechanism to interact with the target.

These, and other, aspects are described throughout the present document.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure will become better understood from the detailed description and the drawings, wherein:

FIG. 1 is a diagram illustrating an exemplary environment, according to some examples.

FIG. 2 is a diagram illustrating an exemplary environment, according to some examples.

FIG. 3A is a diagram illustrating an agricultural scene within a geographic boundary, according to some examples.

FIG. 3B is a diagram illustrating image acquisition and digitization of a geographic boundary, according to some examples.

FIG. 3C is a diagram illustrating image acquisition and digitization of objects in a geographic boundary measured across time, according to some examples.

FIG. 3D is a diagram illustrating an interface for interacting with a digitized agricultural scene, according to some examples.

FIG. 3E is a diagram illustrating an interface for interacting with an agricultural scene, according to some examples.

FIG. 4 is a diagram illustrating an example agricultural observation and treatment system, according to some examples.

FIG. 5 is a diagram illustrating an example vehicle supporting an observation and treatment system performing in a geographic boundary, according to some examples.

FIG. 6 is a diagram of a vehicle navigating in an agricultural environment, according to some examples.

FIGS. 7A-7C are block diagrams illustrating an exemplary method that may be performed by a treatment system, according to some examples.

FIG. 8 is a diagram illustrating an additional portion of an example agricultural observation and treatment system, according to some examples.

FIG. 9A is a diagram illustrating an example component of an agricultural observation and treatment system, according to some examples.

FIG. 9B is a diagram illustrating an example component of an agricultural observation and treatment system, according to some examples.

FIG. 10 is a block diagram illustrating an exemplary method that may be performed by a treatment system, according to some examples.

FIG. 11A is a block diagram illustrating an exemplary method that may be performed by a treatment system, according to some examples.

FIG. 11B is a block diagram illustrating an exemplary method that may be performed by a treatment system, according to some examples.

FIG. 12A is a diagram illustrating an exemplary labelled image, according to some examples.

FIG. 12B is a diagram illustrating an exemplary labelled image, according to some examples.

FIG. 13A is a block diagram illustrating an exemplary method that may be performed by a treatment system, according to some examples.

FIG. 13B is a block diagram illustrating an exemplary method that may be performed by a treatment system, according to some examples.

FIG. 14 is a diagram illustrating an example image acquisition to object determination performed by an example system, according to some examples.

FIG. 15A is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 15B is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 16 is a diagram illustrating capturing an action performed by an observation and treatment system, according to some examples.

FIG. 17A is a diagram illustrating capturing action and treatment pattern detection, according to some examples.

FIG. 17B is a diagram illustrating capturing action and treatment pattern detection, according to some examples.

FIG. 17C is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 17D is a diagram illustrating capturing action and treatment pattern detection, according to some examples.

FIG. 17E is a diagram illustrating capturing an action and a treatment pattern, according to some examples.

FIG. 18 is a diagram illustrating axes of movement, rotation, and degrees of freedom of a vehicle and components of an observation and treatment system, according to some examples.

FIG. 19A is a diagram illustrating an example vehicle supporting an example observation and treatment system, according to some examples.

FIG. 19B is a diagram illustrating an example vehicle supporting an example observation and treatment system, according to some examples.

FIG. 20A is a diagram illustrating an example vehicle supporting an example observation and treatment system performing in a geographic boundary, according to some examples.

FIG. 20B is a diagram illustrating an example vehicle supporting an example observation and treatment system performing in a geographic boundary, according to some examples.

FIG. 21 is a diagram illustrating an example observation and treatment system and components of the observation and treatment system, according to some examples.

FIG. 22 is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 23 is a diagram illustrating a treatment system observing an environment and performing actions in a geographic boundary, according to some examples.

FIG. 24 is a diagram illustrating an example configuration of a system with a treatment unit having an example configuration of a fluid source and fluid flow mechanisms.

FIG. 25 is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 26 is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 27 is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 28 is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 29 is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 30 is a block diagram illustrating an exemplary method that may be performed by an agricultural observation and treatment system, according to some examples.

FIG. 31 is a block diagram of a system described in the present document.

FIG. 32A is a block diagram of an onsite platform.

FIG. 32B is a block diagram of an example implementation of a real-time processing engine.

FIG. 32C depicts an example of a Machine Learning (ML) system implementation.

FIG. 32D shows examples of sensors equipped on an onsite platform.

FIG. 32E shows examples of inputs/outputs available on an onsite platform.

FIG. 33 is an example of offsite computational resources that provide support for an automated weed elimination process.

FIG. 34 is an example of target processing.

FIGS. 35A-35B show example implementations of ML image processing for automated implementation of agricultural activities.

FIG. 36 shows an example method of detection and elimination of undesirable vegetation.

FIG. 37 shows an example of a hardware platform on which the techniques described in the present document may be implemented.

FIG. 38 shows an example of image preprocessing.

FIG. 39 shows an example of an initialization of a system mounted on an agricultural vehicle.

FIG. 40 is an example of a neural network for machine learning.

FIG. 41 is a flowchart of an example method of detection and control of undesirable vegetation.

FIG. 42 is a flowchart of an example method of operation of an edge server.

FIG. 43 is a flowchart of an example of a calibration method.

FIG. 44 is a flowchart of an example method of image processing.

FIG. 45A is a flowchart of an example method of object detection.

FIG. 45B is a flowchart of an example method described in the present document.

FIG. 45C is a flowchart of an example method described in the present document.

DETAILED DESCRIPTION

In this specification, reference is made in detail to specific embodiments of the disclosure. Some of the embodiments or their aspects are illustrated in the drawings.

For clarity in explanation, the present document has been described with reference to specific embodiments; however, it should be understood that the disclosure is not limited to the described embodiments. On the contrary, the disclosure covers alternatives, modifications, and equivalents as may be included within its scope as defined by any patent claims. The following embodiments of the disclosure are set forth without any loss of generality to, and without imposing limitations on, the claimed disclosure. In the following description, specific details are set forth in order to provide a thorough understanding of the present disclosure. The present disclosure may be practiced without some or all of these specific details. In addition, well known features may not have been described in detail to avoid unnecessarily obscuring the disclosure.

In addition, it should be understood that steps of the exemplary methods set forth in this exemplary patent can be performed in different orders than the order presented in this specification. Furthermore, some steps of the exemplary methods may be performed in parallel rather than being performed sequentially. Also, the steps of the exemplary methods may be performed in a network environment in which some steps are performed by different computers in the networked environment.

Some embodiments are implemented by a computer system. A computer system may include a processor, a memory, and a non-transitory computer-readable medium. The memory and non-transitory medium may store instructions for performing methods and steps described herein. Various examples and embodiments described below relate generally to robotics, autonomous driving systems, and autonomous agricultural application systems, such as an autonomous agricultural observation and treatment system, utilizing computer software and systems, computer vision and automation to autonomously identify an agricultural object, including any and all unique growth stages of agricultural objects identified, including crops or other plants or portions of a plant, characteristics and objects of a scene or geographic boundary, environment characteristics, or a combination thereof.

Additionally, the systems, robots, computer software and systems, applications using computer vision and automation, or a combination thereof, can be configured to observe a geographic boundary having one or more plants growing agricultural objects identified as potential crops, detect specific agricultural objects of each individual plant and portions of the plant, determine that one or more specific individual agricultural objects in the real-world geographic boundary require a treatment based on their growth stage and treatment history from previous observations and treatments, and deliver a specific treatment to each of the desired agricultural objects, among other objects. Generally, the computer system provides computer vision functionality using stereoscopic digital cameras, performs object detection and classification, and applies a chemical treatment to target objects that are potential crops via an integrated onboard observation and treatment system. The system utilizes one or more image sensors, including stereoscopic cameras, to obtain digital imagery, including 3D imagery, of an agricultural scene such as a tree in an orchard or a row of plants on a farm while the system moves along a path near the crops. Onboard light sources, such as LEDs, may be used by the system to provide a consistent level of illumination of the crops while imagery of the crops is being obtained by the image sensors. The system can then identify and recognize different types of objects in the imagery. Based on detected types of objects in the digital imagery, or the same object from one moment in time to another moment in time experiencing a different growth stage which can be recognized, observed, and identified by the system, as well as the system associating the growth stage or the different label with a unique individual agricultural object previously identified and located at a previous growth stage, the system can apply a treatment, for example spray the real-world object with chemicals pumped from one or more liquid tanks, onto a surface of the agricultural object. The system may optionally use one or more additional image sensors to record the treatment, as a projectile, as it is applied from the system to the agricultural object in proximity to the system.
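By way of a non-limiting illustration, the observe-detect-treat loop described above can be summarized in the following Python sketch. The detector, the `Sprayer` interface, the target labels, and the confidence threshold are hypothetical placeholders made for the example and are not the actual components or parameters of the disclosed system.

```python
# Minimal sketch of an observe -> detect -> treat loop.
# Detector, model, and sprayer interfaces are hypothetical placeholders.
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    label: str          # e.g. "blossom", "fruitlet", "weed"
    confidence: float   # detector score in [0, 1]
    box: tuple          # (x, y, w, h) in image pixels

class Sprayer:
    """Placeholder for a treatment unit that can be aimed and activated."""
    def aim(self, box: tuple) -> None:
        print(f"aiming at {box}")
    def spray(self, volume_ml: float) -> None:
        print(f"spraying {volume_ml} ml")

def detect_objects(image) -> List[Detection]:
    """Stand-in for an onboard ML detector (e.g. a CNN producing boxes)."""
    return []  # a real model would return labeled boxes here

def process_frame(image, sprayer: Sprayer,
                  targets=frozenset({"blossom", "fruitlet"})):
    results = []
    for det in detect_objects(image):
        is_target = det.label in targets and det.confidence > 0.5
        results.append((det, is_target))   # log every decision
        if is_target:
            sprayer.aim(det.box)
            sprayer.spray(volume_ml=0.2)   # example dose
    return results
```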

Referring now to FIG. 1, a diagram of an exemplary network environment in which example systems and devices may operate is shown. In the exemplary environment, clients 141 are connected over a network 145 to a server 150 having local storage 151. Clients and servers in this environment may be computers. Server 150 may be configured to handle requests from clients. Server 150 may be implemented as a number of networked server devices, though it is illustrated as a single entity. Communications and transmissions between a base station and one or more vehicles, or other ground mobility units configured to support a server 150, and between a base station and one or more control centers as described herein may be executed similarly to the client 141 requests.

The exemplary environment is illustrated with only two clients and one server for simplicity, though in practice there may be more or fewer clients and servers. The computers have been termed clients and servers, though clients can also play the role of servers and servers can also play the role of clients. In some examples, the clients 141 may communicate with each other as well as the servers. Also, the server 150 may communicate with other servers.

The network 145 may be, for example, a local area network (LAN), a wide area network (WAN), networks utilizing 5G wireless standards technology, telephone networks, wireless networks, intranets, the Internet, or combinations of networks. The server 150 may be connected to storage 152 over a connection medium, which may be a bus, crossbar, network, wireless communication interface, or other interconnect. Storage 152 may be implemented as a network of multiple storage devices, though it is illustrated as a single entity. Storage 152 may be a file system, disk, database, or other storage.

In one example, the client 141 may perform one or more methods herein and, as a result, store a file in the storage 152. This may be accomplished via communication over the network 145 between the client 141 and server 150. For example, the client may communicate a request to the server 150 to store a file with a specified name in the storage 152. The server 150 may respond to the request and store the file with the specified name in the storage 152. The file to be saved may exist on the client 141 or may already exist in the server's local storage 151.

In another embodiment, the client 141 may be a vehicle, or a system or apparatus supported by a vehicle, that sends vehicle sensor data. This may be accomplished via communication over the network 145 between the client 141 and server 150. For example, the client may communicate a request to the server 150 to store a file with a specified file name in the storage 151. The server 150 may respond to the request and store the file with the specified name in the storage 151. The file to be saved may exist on the client 141 or may exist in other storage accessible via the network such as storage 152, or even in storage on the client (e.g., in a peer-to-peer system). In one example, the vehicle can be an electric, gasoline, hydrogen, or hybrid powered vehicle including an all-terrain vehicle, a truck, a tractor, a small rover with a bogey rocker system, or an aerial vehicle such as a drone or small unmanned aerial system capable of supporting a treatment system including vision components, chemical deposition components, and compute components.

In accordance with the above discussion, embodiments can be used to store a file on local storage such as a disk or solid-state drive, or on a removable medium like a flash drive. Furthermore, embodiments may be used to store a file on an external storage device connected to a computer over a connection medium such as a bus, crossbar, network, wireless communication interface, or other interconnect. In addition, embodiments can be used to store a file on a remote server or on a storage device accessible to the remote server.

Furthermore, cloud computing and edge computing are other examples where files are often stored on remote servers or remote storage systems. Cloud computing refers to pooled network resources that can be quickly provisioned so as to allow for easy scalability. Cloud computing can be used to provide software-as-a-service, platform-as-a-service, infrastructure-as-a-service, and similar features. In a cloud computing environment, a user may store a file in the “cloud,” which means that the file is stored on a remote network resource though the actual hardware storing the file may be opaque to the user. Edge computing utilizes processing, storage, transfer, and receiving of data at a remote server more local to where most, or a desired portion, of the data may be processed, stored, and transferred to and from another server, including a central hub or at each geographic boundary where data is captured, processed, stored, transmitted, and received.

FIG. 2 illustrates a diagram 200 of an example system 100 configured to observe a geographic boundary in the real world, for example a farm or orchard, perform object detection, classification, and identification of any and all objects in the geographic boundary including agricultural objects, determine any individual agricultural object that may require an agricultural treatment based on the agricultural object's growth stage, previous treatments applied, and other characteristics observed, particularly at the point in time of the observation by system 100, and apply a specific treatment to the agricultural object. The system 100 can include an object observation and treatment engine that includes an image capture module 104, a request module 106, a positional data module 108 for capturing, fusing, and transmitting sensor data related to position, localization, pose, velocity, and other position related signals to the rest of the system 100, a vehicle module 110, a deposition module 112 for applying a liquid or light treatment on each individual object detected and determined to require a treatment, a targeting module 114 for targeting and tracking an identified object in the real world based on sensor data and object detection in an image captured of the real world while a vehicle is moving, and a user interface (U.I.) module 116. The system 100 may communicate with a user device 140 to display output, via a user interface 144 generated by an application engine 142. In one example, the deposition module 112 can also be a treatment module configured to perform non-fluid type deposition treatment, including having a mechanical mechanism or end effector, including mechanical arms, blades, injectors, drills, tilling mechanisms, etc., that physically interacts with surfaces or roots of plant objects or soil.
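As a purely illustrative aid, the following sketch shows one hypothetical way the named modules of system 100 could be composed in software. The class and method names are assumptions made for the example, not the actual interfaces of the disclosed system.

```python
# Illustrative composition of the modules named above (image capture,
# positional data, deposition, targeting, UI). Interfaces are hypothetical.
class ObservationAndTreatmentEngine:
    def __init__(self, image_capture, positional, deposition, targeting, ui):
        self.image_capture = image_capture   # module 104
        self.positional = positional         # module 108
        self.deposition = deposition         # module 112
        self.targeting = targeting           # module 114
        self.ui = ui                         # module 116

    def step(self):
        frame = self.image_capture.grab()            # one sensor frame
        pose = self.positional.current_pose()        # fused pose estimate
        targets = self.targeting.track(frame, pose)  # detect + track targets
        for target in targets:
            self.deposition.treat(target)            # liquid/light treatment
        self.ui.report(frame, targets)               # surface results to a user
```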

The system 100 can also include an image processing module 130, either on board a vehicle supporting the system 100, part of the system 100, embedded in the system 100, or supported by one or more servers or computing devices remote from the vehicle supporting the system 100. The image processing module 130 can be configured to process any and all images or other sensor data captured by the system 100, including feature extraction, object identification, detection, and classification, image matching, comparing, and corresponding with other images received simultaneously or previously of the same location, labelling unique features in each of the images, as well as point clouds from various other sensors such as that of lidars, or a combination thereof.

While the databases 120, 122 and 124 are displayed separately, the databases and information maintained in a database may be combined together or further separated in a manner that promotes retrieval and storage efficiency and/or data security.

FIG. 3A illustrates a diagram 300a depicting an agricultural scene. The agricultural scene can be any physical environment in the real world used for agriculture such as, but not limited to, a farm or orchard. The agricultural scene can be contained in a regional geographic boundary or a region without any defined boundaries. The agricultural scene can include agricultural objects including a plurality of one or more different types of plant objects having different plant phenology depending on the season or year on the same agricultural scene. The agricultural objects can be further observed and categorized based on each plant's anatomy. For example, diagram 300a can illustrate an orchard having permanent plants, such as one or more trees 303. These trees 303 can be permanent trees that can produce crops, such as fruit trees or nut trees, in seasonal or yearly cycles for multiple years. The plants can also be row crops for harvesting, where the plants themselves are for harvest. The agricultural objects observed and potentially treated can be further categorized and identified by the anatomy of the specific type of tree 303. For example, a plant such as a tree 303 can include a trunk, roots, branches, stems, leaves, petals, flowers, plant pistils and stigma, buds, fruitlets, fruits, and many other portions of a plant that make up the plant's anatomy, all of which can be agricultural objects of interest for observation and treatment. For example, the tree 303 in diagram 300a can include one or more agricultural objects 302. These objects can include fruiting flowers or fruitlets that an agricultural treatment system can detect and identify in real time, and perform an action to treat the flower or fruitlet.

The agricultural scene can also include an agricultural observation and treatment system 311, supported by an example vehicle 310, performing observations and actions in the agricultural scene. In one example, the vehicle 310 can travel inside an orchard along a path 312 such that the agricultural observation and treatment system 311 can sense, identify, and perform actions on specific agricultural objects 302 in real time, and index and store the sensed objects 302 and action history, such that the observation and treatment system 311 can use the previously stored information about the specific object 302 that was observed and treated for its next treatment upon detection at a later time or a later phenological stage of the specific object 302. The agricultural observation and treatment system 311 itself can be a component or subsystem of a larger system that can perform computations, store and display information, make decisions, and transmit and receive data from a plurality of agricultural observation and treatment systems performing observations and actions on a plurality of geographic scenes. The larger system can manage a mesh network of individual agricultural observation and treatment systems, each performing online, and onboard a vehicle, in one or more geographic regions, and a mesh network of servers and other compute devices in the cloud or edge to perform real-time functions, quasi real-time functions, or support functions for each online agricultural observation and treatment system, or offline at one or more servers to analyze data such as sensor data and performance activity, perform training of one or more machine learning models, update machine learning models stored on one or more of the agricultural observation and treatment systems located at various geographic regions, as well as a plurality of other tasks and storage capabilities that can generally be performed or maintained offline from the online and real-time performing treatment systems. Various examples of agricultural observation and treatment systems, or components of modular agricultural observation and treatment systems, are described in further detail below in this disclosure.

In one example, the agricultural scene can be that of an orchard having a plurality of fruiting trees planted in rows as illustrated in diagram 300a. The rows can be further partitioned and categorized by zones 304. In this example, the treatment system 311 can perform a different variety of chemical treatments with varying treatment parameters, such as chemicals used, chemical composition, and treatment frequency, and perform A/B type testing (A/B testing) on the agricultural scene by different zones of the same plant type, different chemical trials in the same or different zones or by different individual plant objects for harvest, or a combination thereof. The A/B testing for best treatment or best trial discovery can be performed at a microarray level such that varying chemical types can be used in real time and varying chemical compositions and concentrations can be used in real time. These combinations can go up to over a million different combinations of different compositions, concentrations, volumes, and frequencies of chemical treatment on varying plant varieties at different stages of growth. In one example, the agricultural observation and treatment system 311 can apply and log each of these different possibilities of varying treatment parameters and perform A/B testing at each zone, each tree, or each crop level of specificity to determine the optimal treatment process for each plant or crop type that has not been previously identified in the industry. For example, as the agricultural observation and treatment system 311 applies different treatment parameters to different objects in the same geographic region throughout the growing cycle, upon harvest, some fruiting objects will have more desirable traits and characteristics than those of others of the same type of crop. The agricultural observation and treatment system 311 can determine which exact object was treated and logged from the beginning of the grow cycle for that particular object of the crop, determine the object's specific treatment history, including treatments used, concentration, volume, frequency, etc., and determine that the particular treatment process, based on the treatment history of that particular object that fruited into the most desired version of the crop, is the optimal process based on the A/B testing.
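The per-object treatment logging and A/B-style comparison described above could, for example, be organized as in the following hedged sketch. The record fields and the scoring function are assumptions made for illustration only, not the system's actual schema.

```python
# Sketch of per-object treatment logging for A/B-style trials.
from collections import defaultdict

treatment_log = defaultdict(list)   # object_id -> list of treatment records

def log_treatment(object_id, chemical, concentration, volume_ml, timestamp):
    treatment_log[object_id].append({
        "chemical": chemical,
        "concentration": concentration,
        "volume_ml": volume_ml,
        "timestamp": timestamp,
    })

def best_treatment_history(harvest_scores):
    """Given harvest outcomes keyed by object_id (higher is better),
    return the treatment history of the best-performing object."""
    best_id = max(harvest_scores, key=harvest_scores.get)
    return best_id, treatment_log[best_id]
```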

Additionally, based on the zone 304 of plants that produces the best crops, or the best crop at the individual object or fruit level of each zone 304, the best crops being based on size, health, color, amount, taste, etc. of the crop, the agricultural observation and treatment system 311 can determine the best method of performing treatment actions, based on a variety of parameters that can be adjusted and customized, and apply the same method of treatment actions used on the particular zone 304 that yielded the best crop to other crops in a new or subsequent crop cycle. In one example, treating each agricultural object with a different treatment parameter to determine the best method of treating a crop does not have to be partitioned by zones 304. The agricultural observation and treatment system 311 can identify, tag, observe, and log each unique agricultural object 302 and treat each agricultural object 302 of interest at the individual agricultural object level. For example, instead of treating a first zone 304 with a certain amount or type of chemical for each agricultural object and treating a second zone 304 with a different amount or type of chemical, the treatment system 311 can treat a first agricultural object 302, such as a plant bud, and a second agricultural object 302, a different plant bud at the same stage of growth as that of the first plant bud, to observe and discover which bud yields the better fruit.

In one example, the agricultural scene is an orchard having a plurality of rows and trees planted in each row. The vehicle 310 can autonomously travel through each row such that the treatment system 311 can scan one or more trees 303 along a path of the vehicle to detect various agricultural objects including agricultural objects 302 for treatment. Once the treatment system 311's sensing system senses a potential agricultural object, the system 311 can determine whether the agricultural object 302 detected is a new object identified for the first time; a previously identified, tagged, and stored object detected again; a previously identified, tagged, and stored object detected again that has changed its state or stage of growth in its phenological cycle; a previously identified object that has moved or changed in anatomy; or another object with varying detected characteristics such as stage of growth, size, color, health, density, etc. Once the object is detected in real time, whether or not it is an object previously identified and mapped onto a virtual agricultural scene representing the real agricultural scene, the treatment system 311 can determine, based on a combination of determining the agricultural object's identity, phenotype, stage of growth, and treatment history, if any, whether to perform a unique action onto the agricultural object 302 identified. The action can be an interaction between a treatment unit of the treatment system 311 and a target, including preparing a chemical fluid projectile emitted from a device or treatment unit as part of the treatment system 311 directly onto a portion of a surface of the agricultural object 302. The fluid can be a single liquid projectile similar to that of a shape of a water droplet emitted from a water sprayer, a mist or aerosol, a volumetric spray across a period of time, or many other types of fluid that can be emitted from a device discussed later in this disclosure.
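One illustrative way to express the decision described above, namely whether a detected object is new, previously seen, or previously seen with a changed growth stage, is sketched below. The index format, distance threshold, and field names are assumptions made for the example, not the system's actual logic.

```python
# Sketch of a per-detection decision: new object, previously seen object,
# or previously seen object whose growth stage has changed.
import math

def match_known_object(position, index, max_dist_m=0.05):
    """Return the id of a previously stored object near `position`, if any."""
    for obj_id, rec in index.items():
        if math.dist(position, rec["position"]) <= max_dist_m:
            return obj_id
    return None

def decide(detection, index):
    """detection: {"position": (x, y, z), "stage": str}
    index: {object_id: {"position": (x, y, z), "stage": str}}"""
    obj_id = match_known_object(detection["position"], index)
    if obj_id is None:
        return "new_object", None
    prior = index[obj_id]
    if prior["stage"] != detection["stage"]:
        return "stage_changed", obj_id   # e.g. bud -> blossom
    return "seen_before", obj_id
```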

The actions performed by the observation and treatment system 311 can be performed for purposes similar to those of many actions typically performed in agriculture. These actions can include soil and fertilizer deposition, emitting seeds from the treatment system 311 into soil or dirt, and treating individual plant objects including thinning, weeding, pollinating, pruning, extracting, and harvesting, among many other actions that can be performed by a treatment system 311 having a device configured to sense an individual object and its stage of growth, access its treatment history, and perform a physical action including emitting a fluid or small object, shining a light source such as a laser onto the individual object, physically manipulating the object including removing or moving the object for better sensing and treatment of another object, destroying the object, pruning or harvesting the object, or a combination thereof.

In one example, the agricultural scene and geographic region can be a farm where the ground or terrain is partitioned into a plurality of rows with row crops for planting, growing, and harvesting and the plants themselves are harvested, unlike that of orchards where agricultural objects are harvested from permanent plants. The observation and treatment can be performed on the crops themselves, or on other plants of interest. For example, weeds can grow in the same agricultural scene as that of a crop of interest such that the observation and treatment performed by treatment system 311 can be that of both the crop and the one or more different types of weeds, or just the weeds. In another example, the agricultural scene can be that of a farm, orchard, or any kind of ground terrain that does not yet have any trees or crops, but only dirt and soil.

FIG. 3B and FIG. 3C illustrate diagrams 300b and 300c, each depicting a portion of a virtual and digitized agricultural scene or area similar to that of the agricultural scene in diagram 300a. Diagram 300b can depict a virtual scene generated by an agricultural observation and treatment system similar to that of agricultural observation and treatment system 311, or at servers, cloud, or edge computing devices connected to an agricultural observation and treatment system operating and acquiring images and other perception data of the agricultural scene. The virtual scene can be that of a high definition 2D map, 3D map, or both, of an agricultural scene surveyed, observed, treated, logged, or a combination thereof, by a treatment system, such as treatment system 311. The treatment system 311, having perception and navigation related sensors and a plurality of modular treatment modules each having its own sensors, including vision and navigation sensors, compute units, treatment devices or units, and illumination devices, can be supported by a vehicle 310 that can drive along a path, and can be configured to scan and observe a geographic scene and build a virtual map of the scene.

In general, the vehicle 310 moves along a path in the real world while the agricultural observation and treatment system 311 obtains imagery and other sensed readings of the external environment, including images captured by image capture devices, point clouds captured by LiDARs, or a plurality of different sensor readings captured by a plurality of different sensors. The observation and treatment system can generate points along the path representing external agricultural objects (e.g., plants, crops, trees, debris, patterns, landmarks, keypoints or salient points, clusters of features or patterns that are fixed in space, etc.).

For example, as the vehicle 310 passes by a particular agricultural object in the real world, the object determination and object treatment engine can capture images and reconstruct a digital or virtual geographic scene representing the geographic scene as illustrated by diagram 300b. The diagram 300b can include a plurality of mapped data points representing agricultural objects or clusters of objects, including objects that have been treated, objects for observation, objects indexed for marking the location of nearby objects in the overall geographic scene or the object itself in the global scene, or a combination thereof, as well as landmarks, patterns, regions of interest, or a combination thereof. The mapped points are depicted by objects 320, which can include agricultural objects for treatment, with different identifiers based on the phenology or stage of growth of each individual object. The points depicted by objects 320 can be generated and/or represented by images taken in the real world of the scene, patches of images, lidar point clouds or portions of point clouds, or 3D images modelled by various imaging techniques such as visual constructions of objects in computer vision, including structure from motion or 3D model reconstruction from a single, stereo, or multi camera configuration, the cameras being color sensors, black and white sensors, multispectral sensors, or a combination thereof. Multiple images of the same scene or same object can be combined and analyzed as the agricultural observation and treatment system 311 scans and observes the environment multiple times throughout a grow season or year. Each object, cluster of objects, or landmark detected can have a plurality of layers of images or other sensor readings, such as radar and lidar point cloud readings, to form high resolution 2D or 3D reconstructed models of the real-world objects detected. In one example, a stereo vision system in an image capture module can capture images of objects in space and superimpose views of the objects captured in the frames captured in time in stereo into a 3D model of the object. In one example, the generated 3D models of the objects detected, including agricultural objects, with different models at each of their different detected and labelled growth stages, can be positioned in the virtual geographic boundary or geographic scene for a user to scan through and see via the user interface described below. In one example, a geographic scene can be a fruit orchard having multiple trees 303 and agricultural objects 302. The agricultural observation and treatment system 311 can observe the geographic scene, both in real time via compute units disposed on board the vehicle 310 or edge or cloud compute devices, or offline. The system can generate a digital or virtual map of the scene having a plurality of objects 320, or clusters of objects 320 that can represent a portion of an entire plant 330, for example a tree. Each object 320 can include agricultural objects such as fruits, buds, flowers, fruitlets, or other types of objects that can be treated by a treatment system, depending on its stage of growth or phenology.

The objects 320 can be digitally indexed objects having a type, identity, stage of growth, and location associated with the objects, which can be represented by individual images, stereo pair images, or portions of images of the real-world equivalent object captured by an image capture device. The object 320 may have associated geographic data, including position data, orientation, and pose estimation relative to the geographic boundary view or relative to physical components of the agricultural treatment system 400, including image sensors or treatment engines, or relative to other agricultural objects. In one example, each of the objects 320 can include images that are full frame images captured by one or more cameras in the agricultural treatment system. The full frames can be 2D or 3D images showing the images captured directly by one or more cameras and/or rendered by the agricultural treatment system 311. The images can include images captured a few meters away from the physical surface and position of agricultural objects in the geographic boundary, which can include images of a plurality of individual agricultural objects that are potential crops, as well as landmarks including objects or scenery, or other objects of interest including calibration targets and markers or other farming equipment, devices, structures, or machinery typically found on a farm that can be detected for localization of the treatment system and tracking objects in real time and for constructing a map of a scene either in real time or offline. The objects 320 can also include specific patches within captured full frame images. For example, a stereo pair of cameras can each, simultaneously, capture a high-resolution image having hundreds of agricultural objects including target objects for treatment, potential objects for observation for treating at a future time, landmarks, other patterns, etc. The patches can be identified by the agricultural system by detecting, classifying, identifying and extracting features, and labelling specific portions of a full image frame, including labelling agricultural objects and specific stages of growth of agricultural objects. The portion of the full frame image can be extracted as a patch such that each individual patch, which itself is a portion of the full frame image, is a visual representation of each individual and unique agricultural object in the geographic boundary, and can be identified and indexed, associated with its position data, any and all treatment history if any of the objects detected are objects detected from a previous trial on the specific marked and identified agricultural object, as well as timestamps associated with the image captured and data acquisition, position captured, treatment applied, or a combination thereof. In one example, each object 320 in the virtual construction of a scene depicted by diagram 300b can be a point of varying size depending on the actual size of the object in the real world, or the size of the patch associated with the object. In another example, each object 320 in the virtually constructed scene can be represented by the real-world object's virtual 3D model associated with each of the objects 320. For example, each of the thousands or millions of objects, landmarks, or patterns can be visually represented as 2D or 3D models of each of the specific objects in the real world.
Thus, a map, either a 2D or a 3D map, can be generated and accessed, visually, illustrating each object, landmark, pattern, or region of interest in the real world such that each object's and/or landmark's visualization, structure, location, treatment, and prediction details can be represented and displayed in the map.
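For illustration only, an indexed object 320 and the map that holds such objects could be represented by a record along the following lines; the field names and types are assumptions, not the disclosed schema.

```python
# Illustrative record for an indexed object 320 in a virtual scene map:
# identity, growth stage, location, image patches, and treatment history.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class IndexedObject:
    object_id: str
    object_type: str                      # e.g. "apple_fruitlet"
    growth_stage: str                     # e.g. "bud", "blossom", "fruitlet"
    position: Tuple[float, float, float]  # real-world coordinates (meters)
    patches: List[str] = field(default_factory=list)       # paths to image patches
    treatments: List[dict] = field(default_factory=list)   # prior treatment records
    observed_at: List[float] = field(default_factory=list) # capture timestamps

# A 2D/3D scene map can then be a simple lookup from id to record:
scene_map = {}  # object_id -> IndexedObject
```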

A user interface can be accessed to interface with the digitally mapped agricultural scene such that a user can view images, models, a model of the specific object, spray history, and other data related to agricultural objects, including predictions related to yield, size, health, disease, etc., of each object observed and/or treated, associated with its location in the real world based on its location in the digital map. For example, a cluster of objects 320, including fruits and fruitlets of a fruiting tree, located in a specific area in an orchard, for example in a specific zone 304 in the real world, can be mapped to a specific zone 340 in the map, such that the cluster of objects 320, accessed by a user, in the map also represents the location of the specific real-world objects associated with objects 320 of the virtual map. The user interface is further described in detail below.

In one example, as illustrated in diagram 300c, a user interface 350 may be accessed. The user interface 350 can show the points representing each of the objects 320 of the virtual map. The user interface may provide for the user selection of any of the points depicted in the virtual scene or map. For example, a unique object 353-a, which in this example can be an individual and unique fruitlet in an orchard, can be detected in the orchard by the agricultural observation and treatment system 311. The object 353-a can be detected in one or more images from a system supported by a moving vehicle. A patch of the image can be identified and extracted depicting just the object 353-a. Additionally, a 3D constructed model of the specific fruitlet, that is object 353-a at the time of capture by the observation and treatment system, can be generated by using one or more computer vision techniques from associated multiple different views of the same object 353-a and associating each view with location data of object 353-a and motion of the image capture device, or by implementing other computer vision techniques.

Upon selection of a specific point in the user interface, for example a point of the map representing object 353-a, the user interface 350 can display multiple visualizations and information associated with the selected object 353-a. For example, the user interface 350 can display an image 352 associated with the point selected, in this case the point associated with object 353-a. The image 352 can be a pixelated 2-dimensional or 3-dimensional image with localization information. The image 352 can also include the specific image patch extracted from the images captured by the treatment system, instead of a constructed image depicting the shape, size, color, or other unique attributes of the object 353-a at the time the observation and treatment system last observed the object 353-a. In one example, the image 352 can be a pixelated 2D or 3D image that represents a model of the specific object 353-a in the real world at a stage of growth, or state, detected by the treatment system. The 2D or 3D model can be generated by using various computer vision techniques by associating multiple views of the object along with depth and/or motion data of the image capture device. Some objects may be occluded such that an image sensor travelling along a path guided by a vehicle may not capture the entire view of the object detected. For example, the object could be hidden further inside of a tree instead of growing closer to the outer surface of the tree. The object could have other objects blocking portions of the object from view, such as leaves or other agricultural objects or landmarks of interest. In this example, one or more machine learning algorithms can be applied to process the existing images sensed of the object, whether through a single pass of the image sensor in a single run, capturing a plurality of image frames forming a single video, or in stereo, or by analyzing image frames from multiple passes, each image frame, whether captured in a single sequence or multiple sequences from multiple trials, having at least a portion of a view of the object. The machine learning algorithm can be used to compensate for the occluded portions of the object and construct a high resolution 2D image or 3D model of the object. In cases where the phenological stage of the object changes in time, the agricultural observation and treatment system 311 can generate multiple constructed images or models depicting different stages of growth of the same object with timestamps associating which stage of growth was detected and constructed and the relationship between the different constructed images and models of the same object, for example showing the phenological changes from one constructed image of the object to the next constructed image of the same object. In one example, the computer vision techniques can be performed using machine learning models and algorithms, either embedded on board the agricultural observation and treatment system 311, or offline via an edge or cloud computing device. The user interface 350 can also display multiple views of the same patch associated with a specific object selected by the user in the virtual map. For example, the agricultural observation and treatment system 311 can capture more than one view of object 353-a and store all of the different frames that include object 353-a.
The system can generate, index, and store each of the individual patches of images depicting different views of the object and display them in the user interface 350 as a group of images 354 for the user when the specific object, for example object 353-a, in the virtual map is selected.

Additionally, the user interface 350 can display a time series lapse of the history of images captured of the same object, the images partitioned based on state changes, stage of growth, or phenological changes, including bud, leaf, shoot, flower/blossom, fruiting developments, and maturing developments of the same object detected. For example, object 353-a can be a fruitlet of an apple fruit, detected as an individual object in the real world with location, position, and/or orientation data, relative to some arbitrary point in the orchard, the exact point in the real world, or a relative position to the agricultural observation and treatment system 311 such that the treatment system itself has location and orientation data in the real world. And the object detected at the time can be identified as a fruitlet of a fruiting tree. However, the object 353-a will have been a flower earlier in its life cycle at a prior time, and a bud in its life cycle at an even prior time to flowering, while still being at the same, or close to the same, location and position in the orchard, and location and position relative to a defined position of the tree supporting the object 353-a. For example, a portion of the tree, such as a base of the trunk or an arbitrary center point of the tree, can be defined with (x₀, y₀, z₀) position data. The object detected can be some (Δx₀, Δy₀, Δz₀) position relative to the base or arbitrarily chosen position (x₀, y₀, z₀) in the geographic scene, zone of the geographic scene, or a particular tree/plant, for example a position (x₁, y₁, z₁). The coordinate system chosen is just an example wherein a plurality of different coordinate systems and origin points can be used to locate relative positions and orientations of objects relative to other objects.
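A minimal sketch of the relative-coordinate bookkeeping described above, assuming a simple tuple convention for (x, y, z) positions, is shown below; the numeric values are illustrative only.

```python
# Sketch: an object's position (x1, y1, z1) expressed as an offset from a
# chosen tree reference point (x0, y0, z0). Values are illustrative.
def relative_position(obj_xyz, ref_xyz):
    """Return (dx, dy, dz) of an object relative to a reference point."""
    return tuple(o - r for o, r in zip(obj_xyz, ref_xyz))

tree_base = (12.0, 4.5, 0.0)   # (x0, y0, z0): e.g. base of the trunk
fruitlet = (12.3, 4.9, 1.8)    # (x1, y1, z1): object detected on the tree
offset = relative_position(fruitlet, tree_base)   # -> (0.3, 0.4, 1.8)
```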

In this example, the agricultural observation and treatment system 311 may have logged information related to object 353-a before it was labelled as a fruitlet in its most recent timed log. The system may have detected the same object 353-a, at the same or a near location, when it was detected with an identifier of a bud associated with the object. And then at a later trial, detecting the same object 353-a again, but before it was detected and labelled with the identifier of fruitlet and after it was detected and labelled as a bud, another detection with the identifier, for example, of a flower/blossom. Each of these detections can have a location position of, or near the location position of, (x₁, y₁, z₁). In this case, the system can associate the different identifications of the same object, based on the object's state changes, or stage of growth or phenological changes, and display, via a series of views across time, the state change in sequence in the user interface 350. Identifying, storing and indexing, and associating portions of images and patches and other sensor readings of objects of the same type with near or the same locations of the same objects identified throughout time from different trials, and identifying different states of the same object in the geographic scene, can be performed using various techniques including machine learning feature extraction, detection, and/or classification to detect and identify objects in a given image frame as well as generating keyframes based on the objects and landmarks detected. The keyframes can be determined to more efficiently identify and index objects in a frame while reducing redundancy, for example, by identifying common and/or the same landmarks across multiple frames. The machine learning and other various computer vision algorithms can be configured to draw bounding boxes to label portions of images with objects of interest and background, apply masking functions to separate background and regions of interest or objects of interest, and perform semantic segmentation on all pixels or a region of pixels of a given image frame to classify each pixel as part of one or more different target objects, other objects of interest, or background and associate its specific location in space relative to a component of the treatment system and the vehicle supporting the treatment system.
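As an illustrative sketch, associating a newly detected object with a previously logged object by location proximity, even when the label has changed between growth stages, could look like the following. The distance threshold and record layout are assumptions made for the example.

```python
# Sketch: associate a new detection with a previously logged object by
# proximity, recording a possible state change (bud -> blossom -> fruitlet).
import math

def associate_across_trials(new_det, prior_objects, max_dist_m=0.05):
    """new_det: {"position": (x, y, z), "label": str, "timestamp": float}
    prior_objects: {object_id: {"position": (...), "states": [(t, label), ...]}}"""
    nearest_id, nearest_d = None, float("inf")
    for obj_id, rec in prior_objects.items():
        d = math.dist(new_det["position"], rec["position"])
        if d < nearest_d:
            nearest_id, nearest_d = obj_id, d
    if nearest_id is not None and nearest_d <= max_dist_m:
        # Same object observed again; append the (possibly new) state.
        prior_objects[nearest_id]["states"].append(
            (new_det["timestamp"], new_det["label"]))
        return nearest_id
    return None   # treat as a newly observed object
```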

Additionally, the agricultural observation and treatment system 311 can perform functions to associate portions of images, for example image patches of objects, image frames, key frames, or a combination thereof, from different trials where the agricultural observation and treatment system 311 observed, identified, labelled, and stored information about the same object across multiple states and phenological stages. Additionally, the association of the frames or portions of the frames can be packaged into a series of image frames that can be displayed in sequence as a video displaying the growth, or backwards growth depending on the direction of displaying the images, of the specific object. For example, the series of indexed images or patches of images associated with each other throughout time can be displayed in the user interface 350 in the video or visual time lapse history 356. In one example, the functions can be performed by various computer vision and machine learning techniques including image to image correspondence, including template matching and outlier rejection performed by various techniques including RANSAC, k-means clustering, or other feature-based object detection techniques for analyzing a series of image frames, or a combination thereof. In one example, the above techniques can also be used to generate key frames of a subsequent trial by comparing frames from the subsequent trial with keyframes of one or more prior trials, depending on how many prior trials there are. Additionally, the comparison and the candidate frames or keyframes from a previous trial, which may be accessed by the agricultural observation and treatment system 311, or at a server offline, to be used to perform comparisons to identify state and phenological stage change of a same object, such as object 353-a, can be narrowed down for selection based on location data logged at the time of capture, pose data logged at the time of capture, or a combination thereof, associated with each of the keyframes or objects detected in each keyframe. These accessed and selected frames or key frames in the prior trials, having been selected based on the location data associated with the frames or objects detected in the frames, can be used to compare with currently captured frames, or subsequently captured frames from the prior frames, having similar location data associated with the selected frames or key frames, to match objects that may have different labels, since different states or phenological stages will have different labels due to the states having different shape, color, size, density, etc. If there is a match, or if a threshold is reached based on the comparison of the accessed frame or keyframe against one or more frames in a subsequently captured series of frames, the agricultural observation and treatment system 311 can determine that the two, or more, objects of different types and identifiers associated with each of the objects are the same object and that one, with a first phenological stage, changed into the other having a second label or identifier of a second phenological stage.
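The following sketch illustrates the class of correspondence techniques named above, template matching and RANSAC-based outlier rejection, using OpenCV. It is a generic example of those techniques, not the system's actual implementation, and the threshold values are assumptions.

```python
# Generic sketch of image-to-image correspondence: template matching and
# ORB feature matching with RANSAC outlier rejection (OpenCV).
import cv2
import numpy as np

def template_match(frame_gray, patch_gray, threshold=0.8):
    """Locate a previously stored object patch inside a new frame."""
    result = cv2.matchTemplate(frame_gray, patch_gray, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    return max_loc if max_val >= threshold else None

def ransac_inlier_matches(img_a_gray, img_b_gray):
    """Feature matches between two frames, filtered by a RANSAC homography."""
    orb = cv2.ORB_create()
    kp_a, des_a = orb.detectAndCompute(img_a_gray, None)
    kp_b, des_b = orb.detectAndCompute(img_b_gray, None)
    if des_a is None or des_b is None:
        return []
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)
    if len(matches) < 4:
        return []
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if mask is None:
        return []
    return [m for m, keep in zip(matches, mask.ravel()) if keep]
```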

In one example, the agricultural observation and treatment system 311 can run a machine learning detector on portions of images to detect objects of interest in each of the portions, by performing feature extraction and generating bounding boxes around an object of interest, by performing semantic segmentation or semantic classification of each pixel of a region of an image to detect objects of interest, or a combination thereof. One or more keyframes from a prior trial, or from a prior frame among the captured frames of a subsequent trial (any trial subsequent to the prior trial), can be propagated into candidate frames captured in the subsequent trial, for example the most recent or current trial, whose captured images have not yet been processed and indexed. The frames with the propagated detections or labels can be used to check whether a machine learning detection was accurate, whether location data associated with the frame is accurate, or to detect other outliers, as a threshold to mitigate false positives and false negatives, that is, features detected that do not actually exist or features missing that should have been detected. Because these frames will have similar location data associated with each other, detecting and corresponding, above a certain threshold, the same features across more than one frame in a same trial, or across frames having the same location from previous trials, can give more confidence that the detected features are real and accurate.
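
The following is a minimal, illustrative sketch in Python of cross-checking labels propagated from prior keyframes against fresh detections at the same location; the agreement rule and threshold are hypothetical and only illustrate the idea of using agreement between trials to flag likely false positives and false negatives.

    def cross_check(propagated_labels, new_detections, agree_threshold=0.5):
        # Labels seen by both sources are confirmed; labels seen by only
        # one source are flagged rather than accepted outright.
        confirmed = propagated_labels & new_detections
        suspect_missing = propagated_labels - new_detections    # possible false negatives
        suspect_spurious = new_detections - propagated_labels   # possible false positives
        agreement = len(confirmed) / max(len(propagated_labels | new_detections), 1)
        return agreement >= agree_threshold, confirmed, suspect_missing, suspect_spurious

    ok, confirmed, missing, spurious = cross_check({"bud", "blossom"}, {"blossom", "leaf"})
    print(ok, sorted(confirmed), sorted(missing), sorted(spurious))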

The series of images can be each of the patches displayed in order, similar to a video display of images, where a user can view the changes in state of the object from when it was a bud, through various growth stages, including small incremental stages from day to day in a growing season, until fruiting. In one example, a visual time lapse history 356 of object 353-a, as it was detected as a bud, then a flower, then a fruitlet, and then a fruit, incrementally, can be displayed in the user interface 350. Additionally, the series of images displayed in the visual time lapse history 356 of an object, for example object 353-a, can be reconstructed or generated images obtained by combining multiple images depicting the same object or landmark. These images would be machine learning rendered images, generated by associating portions of captured images as well as portions of pixels generated by a machine learning model, to display a better representation of the selected object to the user. This can include generating higher resolutions by upscaling portions of the captured images when analyzing them to generate the image displayed to the user, or generating views of a detected object that were otherwise occluded in the captured images. In this example, upon selection of an object in the user interface, the user interface 350 can display any captured image 352, where the image itself can contain smaller patches within the image 352 containing views of objects, or display one or more rendered images generated from a plurality of captured images associated with the selected object.
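
The following is a minimal, illustrative sketch in Python of assembling a per-object time lapse by ordering indexed patches by capture time, so they can be played forward as growth or in reverse; the data layout and file paths are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class IndexedPatch:
        timestamp: float   # capture time, e.g. seconds since epoch
        stage: str         # label at that time: "bud", "blossom", "fruitlet", "fruit"
        image_path: str    # path to the cropped patch on disk (hypothetical layout)

    def build_time_lapse(patches):
        # Order the indexed patches of one object by capture time so they can
        # be played forward (growth) or in reverse (the "time machine" view).
        return sorted(patches, key=lambda p: p.timestamp)

    history = build_time_lapse([
        IndexedPatch(3.0, "fruitlet", "obj353a/trial3.png"),
        IndexedPatch(1.0, "bud", "obj353a/trial1.png"),
        IndexedPatch(2.0, "blossom", "obj353a/trial2.png"),
    ])
    print([p.stage for p in history])            # forward: bud -> blossom -> fruitlet
    print([p.stage for p in reversed(history)])  # reverse: fruitlet -> blossom -> bud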

In one example, the visual time lapse history 356 can be used to visualize the state changes or the real-world growth of an object, from sprouting into a crop, or from bud into fruit, depending on the type of crop. This would give the effect, in some instances, of displaying a growth sequence of an agricultural object from a dormant phase to a fully grown crop at the same or substantially the same location. The location would not be exact, since the object will grow and drop lower due to its weight or can be moved externally by wind. Alternatively, the visual time lapse history 356 can function as a "time machine" visualization of the object. The visual time lapse history 356 can be viewed in reverse time to show what a currently detected object, or the object in its current state in the real world, assuming the agricultural observation and treatment system has captured and detected the object in its current or approximately current state, looked like in the past, by visually linking captured sensor readings, including image frames having views of the object, and displaying them in sequence, such as a video.

The agricultural observation and treatment system can associate similarities between an image frame, or one or more patches within the frame corresponding to an object of interest captured at a first time, and another frame captured at a second time close in proximity to the first time, for example a day later, such that the state change will be minor. The system can combine location data of the objects detected from frame to frame across time, where the frames have the same location associated with the object, so that the system can have more confidence that, for example, an object from a first frame associated with a first timestamp is the same object as one from a second frame associated with a second timestamp, because the real-world locations of both objects from the different frames and timestamps are within a proximity threshold that lets the system determine the two objects are the same. Additionally, the system can determine any relationship between the images of the same object, such as that the object turned from one state detected in the first frame to the other state detected in the second frame. The incremental changes can allow the image correspondence to reach a certain confidence threshold, such that matching an object of a first phenological stage with an object of a second phenological stage as the same object does not have to rely solely on the spatial proximity of the real-world locations associated with the object when it was identified in the respective captured frames and timestamps.
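
The following is a minimal, illustrative sketch in Python of associating two observations as the same object when they are close in both space and time; the distance and day thresholds are hypothetical values chosen only for the example.

    import math

    def likely_same_object(obs_a, obs_b, max_distance_m=0.05, max_gap_days=2.0):
        # Associate two observations as the same object when they are close
        # in space and close in time, so the expected state change is small.
        spatial_ok = math.dist(obs_a["position"], obs_b["position"]) <= max_distance_m
        temporal_ok = abs(obs_a["day"] - obs_b["day"]) <= max_gap_days
        return spatial_ok and temporal_ok

    a = {"position": (10.20, 4.10, 1.50), "day": 101, "stage": "bud"}
    b = {"position": (10.21, 4.09, 1.49), "day": 102, "stage": "blossom"}
    if likely_same_object(a, b):
        print(f"object changed state: {a['stage']} -> {b['stage']}")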

In one example, the user interface 350 can store and display a variety of information, data, logs, predictions, histories, or other information related to each object. The information can be displayed to a user upon selection of the information, or upon selection of the object in an interactive virtual map. In one example, the user interface 350 can display a visualization 358 of various data, including data related to an object's treatment history, observation history, or both. This can include information about each of the times the particular selected object, for example object 353-a, was detected in the real world and indexed. In one example, the detection of an object across multiple frames or sensor readings in a single trial can be categorized and indexed as a single detection. If a treatment was applied, for example a spraying of a substance, a mechanical interaction with the object by a physical end effector contacting object 353-a, or any kind of action other than a treatment that physically affects the object, it can be logged in time and location. As the agricultural observation and treatment system 311 performs multiple trials across a period of time, the system 311 can associate each observation and/or treatment of a same object with the others, and display the information related to observations and treatments in order. The information can include the type of spray or treatment used, the duration of the spray or treatment, the time associated with the treatment, its timestamp, and the phenological stage of the object detected. This can allow the agricultural observation and treatment system 311 to determine the treatment parameters per object. For example, the system can determine, due to its indexing and understanding of each object in a geographic scene, that in an immediately upcoming trial a first object, if detected, should receive a treatment of a first substance, but a similar second object, proximate to the first object, if detected, does not need to receive a treatment of the first substance, at least during the immediately upcoming trial. Further examples will be provided below in this disclosure. Additionally, the user interface 350 can display a visualization 360 of data related to features, attributes, and characteristics of each object, or of the specific object selected. The information can include information relating to the object's size, color, shape, density, and health, or other information related to predictions of yield estimate, future size, shape, and health, and optimal harvest parameters of the specific object. Additionally, since the actions for treating each object can themselves be sensed, indexed, and stored, a user can access each individual treatment action, including its parameters such as type, volume, concentration, dwell time, or surface contact diameter for fluid projectile treatments, on each individual agricultural object or crop throughout the life cycle of that specific individual crop or object detected. This would allow a user to determine growth, health, and harvest parameters and data per crop or per object.
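
The following is a minimal, illustrative sketch in Python of a per-object treatment log and a per-object decision for an upcoming trial; the record fields, substance name, and re-treatment interval are hypothetical and only illustrate deciding treatment parameters per object from its logged history.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class TreatmentRecord:
        timestamp: float
        substance: str
        duration_s: float
        stage: str   # phenological stage of the object at treatment time

    @dataclass
    class ObjectHistory:
        object_id: str
        treatments: List[TreatmentRecord] = field(default_factory=list)

        def needs_treatment(self, substance: str, min_interval_s: float, now: float) -> bool:
            # Skip the object in the upcoming trial if it already received
            # this substance within the minimum re-treatment interval.
            recent = [t for t in self.treatments
                      if t.substance == substance and now - t.timestamp < min_interval_s]
            return len(recent) == 0

    obj = ObjectHistory("353-a", [TreatmentRecord(1_000.0, "thinning-agent", 0.2, "blossom")])
    print(obj.needs_treatment("thinning-agent", min_interval_s=7 * 86_400, now=2_000.0))  # False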

In one example, as illustrated in diagram 300d of FIG. 3D, a user can access the user interface 350 and one or more interactive maps through a variety of devices. In one example, the electronic device can be a tablet 380 having a user interface, including user interface 350, and an interactive virtual map 382. The interactive virtual map 382 can be one of the virtual maps discussed above. For example, the interactive virtual map 382 can be a virtual map associated with a map of a real-world geographic scene having a plurality of agricultural objects and landmarks. Because the geographic scene changes over a period of time, multiple virtual maps can be generated to index each state of the geographic scene, at a global scene level, such as the broader geographic level including terrain, topography, trees, large objects, etc., and at a local level, such as that of each agricultural object, including target crop objects. Both the local scene comprising a plurality of agricultural objects and the global scene can be combined to generate each virtual map. In one example, each virtual map of the same real-world geographic coordinates, or predetermined geofenced location, can be associated with the others such that an interactive changing map can be displayed, where one map changing or updating to another map represents the changing state of the geographical scene across a grow season. This can include plants sprouting, trees growing in size, or fruits growing. Each trial performed by the agricultural observation and treatment system 311 can include a plurality of sensor readings, including images captured from image capture devices that capture 3D structure, location, depth, relative size of objects in the real world, heatmap, etc., such that a virtual map of the area sensed in the trial can be generated.

As more trials are performed, more of the geographic scene can be mapped and thus used to generate a virtual map, or an index of information associated with objects and landmarks of the geographic scene. For example, a first map can be generated to depict a first geographic scene captured at a first time with a first set of characteristics, the characteristics including global characteristics, such as number of sprouts, number of trees, amount and color of visible dirt, topography, etc., and local characteristics, such as each crop object of interest and its phenological stage, depending on the type of geographic scene, such as terrain, row crop farmland, orchard, etc., and a second map can be generated to depict the same first geographic scene captured at a second time having different global and local characteristics.

The system can associate the first and second maps such that there is a logical link between the first and second maps, the indexed information related to each of the first and second maps, the indexed and generated maps, or the generated interactive virtual maps, indicating that the geographic scene having the characteristics captured in the first map has turned into the geographic scene, with the same or a similar real-world geographic boundary, having the characteristics captured in the second map. In one example, the system can generate a single map such that, as the system performs more trials and senses and captures more of both the global and local portions of the geographic scene, thereby mapping more details and characteristic changes of the geographic scene from trial to trial, the system can update the same map into one or more updated maps having updated global and local attributes and characteristics of the geographic scene across time, instead of or in addition to generating multiple maps.

While the description above discusses virtual maps, the discussion can be applied more generally to indexed information of geographic scenes, including geographic scenes with characteristics changing throughout time, stored in multiple forms, and does not necessarily require a generated virtual map that can be visualized and displayed in a user interface. The real-world geographic scene can be sensed and indexed in a database having information relating to agricultural objects and landmarks of the geographic scene, with various sensor readings associated with each agricultural object and landmark, including visual information, location information, etc., such that the information stored in the database can be used to generate a map. The information for each agricultural object and landmark can also be used to generate a visualized virtual map with which a user can interact on an electronic device.
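
The following is a minimal, illustrative sketch in Python of such an index: a per-object record of position and per-trial observations, from which either a virtual map or a per-object time lapse could later be generated; the field names and layout are hypothetical.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Observation:
        trial_id: int
        timestamp: float
        stage: str                # label at that trial, e.g. "bud" or "blossom"
        image_patches: List[str]  # paths to cropped views captured in that trial

    @dataclass
    class IndexedObject:
        object_id: str
        position: tuple           # (x, y, z) in the geographic boundary's frame
        observations: List[Observation] = field(default_factory=list)

    # A minimal "scene index": object id -> full per-trial history.
    scene_index: Dict[str, IndexedObject] = {
        "353-a": IndexedObject("353-a", (10.2, 4.1, 1.5),
                               [Observation(1, 1_000.0, "bud", ["t1/353a.png"])]),
    }
    scene_index["353-a"].observations.append(Observation(2, 2_000.0, "blossom", ["t2/353a.png"]))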

In one example, the tablet 380 can display an interactive virtual map 382 depicting, for example, the most updated map, or the most recently generated map, of a mapped geographic boundary 383. The mapped geographic boundary 383 can be the most recently captured and sensed state of a real-world geographic region depicted in diagram 300a of FIG. 3A, having a plurality of agricultural objects and landmarks sensed, the agricultural objects being in their current state, and indexed, stored, and mapped as mapped geographic boundary 383. A prior mapping, from a previous trial, of the same real-world geographic region can also be indexed, stored, mapped, and associated with mapped geographic boundary 383. For example, agricultural object 370-a can be an individual blossom of an object detected by the agricultural observation and treatment system 311 captured in a recent trial. A user, via tablet 380 or any other electronic device, can interact with interactive virtual map 382 to select a selectable object 370-a in the tablet to view information about object 370-a, including any and all views previously captured of object 370-a, time lapse video and time machine video of object 370-a's history as it blossomed from a bud, for example, treatment history, metadata, and crop characteristics of object 370-a including prediction-type information. The interactive virtual map 382 displaying a mapped geographic boundary 383 can have a plurality of selectable objects 320 to choose from. For example, object 371-a can be a different object in the same geographic scene having a different treatment history from that of object 370-a. The image 352 can be a portion of a larger image captured by one or more image capture devices, such as a 4K or 8K image frame, where image 352 is a cropped portion of the 4K or 8K image frame. Additionally, the image 352 can include one or more even smaller patches of the image 352 of a specific object, such as image patch 352-1 of image 352, to display a view of a virtual or digitized object 370-a of some real-world object 302 in the real world, for example.

In one example, the selectable object in the virtual map 382 can itself be an image. Because the virtual map 382 is interactive, the user can zoom in to the specific object in the virtual map 382 to view the specific object inside interactive virtual map 382. The object zoomed into can be an animated object depicting the specific object sensed and indexed from the real world, or can be an image patch, cropped from an image capture device, having a view of the object in the image patch. The objects and landmarks indexed in the virtual map 382 are associated with a location in the real world. Each animated agricultural object, or representation of the agricultural object, can include data representing at least one image of the agricultural object in the real world captured by an image sensor; localization data representing the position of the agricultural object relative to the geographic boundary itself, the position of the agricultural object relative to the agricultural observation and treatment system that captured an image of the individual agricultural object, or its position relative to other agricultural objects that also have position data associated with them; as well as a timestamp of when the image and location data were acquired.

In one example, one or more agricultural objects detected in the real world will change characteristics, for example phenological stages or size, such that the system 100 can detect a new feature of the agricultural object and assign a label or identifier to the agricultural object that previously had a different label or identifier assigned to it at the same or a similar detected position in the geographic boundary. This is due to a portion of a potential crop growing on a plant, for example a lateral, changing characteristics as the plant advances through its growth stages. As a simplified example, a fruiting tree can have buds on the tree's laterals which can turn into flowers, then eventually fruitlets, and then fruit. Additionally, each of these features can be associated with the others, particularly for labeled features of agricultural objects that have the same detected position in the real world, or similar image features from a previous trial in which the system 100 captured images of the specific agricultural object, or a combination thereof.

FIG. 3E illustrates a diagram 300e depicting a user, or human 381, interacting in a real-world environment with an electronic device having a user interface and an interactive virtual map similar to the user interfaces and interactive virtual maps discussed above. A user can have an electronic device with location and image sensing capabilities to detect the location of the device in the real world, the location of the device relative to an identified object, the identified object having location data stored on the device or at a location accessible wirelessly by the device, or a combination thereof. As the user physically navigates in the geographic boundary, such as an orchard, the user may come across one or more indexed objects in the real world that may be in plain view or viewable in real time by the electronic device, such as the tablet 380, a phone or smart device 385, smart glasses 386 or mixed reality smart glasses, or any other wearable or holdable device. In one example, the electronic device can be a drone controlled by the user in real time, such that the drone's view can be relayed and displayed in real time to the user via a device the user is holding, with an interactive interface including a screen.

For example, if the user is near agricultural object 370-a in the real world, the user can access information stored about object 370-a on the electronic device, including the most recent views of the object, treatment history, or other information and metadata about the object, particularly those discussed above.

Additionally, an augmented reality or mixed reality environment can be accessed via an electronic device such as a wearable with a display and image sensors, a phone or tablet with image sensors, or a combination thereof. As the user physically navigates the real-world geographic scene, the electronic device's image sensors can capture and detect objects in its field of view. Each object previously detected, indexed, and stored can be displayed to the user in real time via augmented reality or mixed reality, as the same objects in the real world are detected by the electronic device in real time. The user can then interact with the electronic device in a manner similar to that described above. In one example, an entire virtual map can be augmented onto the real-world geographic scene so that the user can see information about every object in the user's, or electronic device's, field of view. In one example, a virtual reality environment can be generated such that a user having a virtual reality device can navigate inside the virtual reality environment and interact with each agricultural object and landmark displayed and created in the virtual reality environment. The user can view portions of the entire virtual reality environment changing across time, either forward or backward. For example, a first virtual reality environment and scene depicting the geographic scene at a given time can change, gradually or instantly, to a second virtual reality environment depicting the geographic scene at a different time. Each of the objects and landmarks can also be selectable so the user can view specific views captured of the objects at various times from different trials.

FIG. 4 illustrates a system architecture of an agricultural observation and treatment system, or agricultural treatment system 400, or treatment system. The agricultural treatment system 400 can include a robot having a plurality of computing, control, sensing, navigation, processing, power, and network modules, configured to observe a plant, soil, or agricultural environment; to treat a plant, soil, or agricultural environment, or a combination thereof, such as treating a plant for growth, fertilizing, pollinating, protecting and treating its health, thinning, harvesting, or treating a plant for the removal of unwanted plants or organisms, or stopping growth of certain identified plants or portions of a plant; or a combination thereof. In one example, an agricultural observation and treatment system, described in this disclosure, can be referred to as a portion of a system for observing and treating objects that is onboard a moving vehicle. Performances by the portion of the system onboard the moving vehicle, including computations and physical actions, can be considered online or live performance. A portion of the system comprising one or more compute or storage components, connected as a distributed system, can be considered the offline portion of the system configured to perform remote computing, serve as a user interface, or provide storage. In one example, the agricultural observation and treatment system is a distributed system, distributed via cloud computing, fog computing, edge computing, or a combination thereof, or more than one subsystem performs computations and actions live in addition to the portion of the system onboard the moving vehicle.

The systems, robots, computer software and systems, applications using computer vision and automation, or a combination thereof, can be implemented using data science and data analysis, including machine learning, deep learning including convolutional neural networks ("CNNs") and deep neural networks ("DNNs"), and other disciplines of computer-based artificial intelligence, as well as computer vision techniques used to compare and correspond features or portions of one or more images, including 2D and 3D images, to facilitate detection, identification, classification, and treatment of individual agricultural objects; to perform and implement visualization, mapping, pose estimation of an agricultural object or of the robotic system, and/or navigation applications using simultaneous localization and mapping (SLAM) systems and algorithms, visual odometry systems and algorithms, including stereo visual odometry, or a combination thereof; and to receive and fuse sensor data with sensing technologies to provide perception, navigation, mapping, visualization, mobility, tracking, and targeting, with sensing devices including cameras, depth sensing cameras or other depth sensors, black and white cameras, color cameras including RGB cameras, RGB-D cameras, infrared cameras, multispectral sensors, line scan cameras, area scan cameras, rolling shutter and global shutter cameras, optoelectric sensors, photooptic sensors, light detection and ranging (LiDAR) sensors including spinning LiDAR, flash LiDAR, static LiDAR, etc., lasers, radar sensors, sonar sensors, radio sensors, ultrasonic sensors and rangefinders, other range sensors, photoelectric sensors, global positioning systems (GPS), inertial measurement units (IMUs) including gyroscopes, accelerometers, and magnetometers, speedometers, wheel odometry sensors and encoders, wind sensors, stereo vision systems and multi-camera systems, omni-directional vision systems, wired and wireless communications systems and network communications systems including 5G wireless communications, computing systems including on-board computing, mobile computing, edge computing, cloud and cloudlet computing, fog computing, and other centralized and decentralized computing systems and methods, as well as vehicle and autonomous vehicle technologies including associated mechanical, electrical, and electronic hardware. The systems, robots, computer software and systems, applications using computer vision and automation, or a combination thereof, described above can be applied, for example, among objects in a geographic boundary to observe, identify, index with timestamps and history, and/or apply any number of treatments to objects, and, more specifically, as an agricultural delivery system configured to observe, identify, index, and/or apply, for example, an agricultural treatment to an identified agricultural object based on its location in the real-world geographic boundary, growth stage, and any and all treatment history.

In this example, the agricultural treatment system 400 can include an on-board computing unit 420, such as a compute unit 420 embedded with a system on chip. The on-board computing unit can include a compute module 424 configured to process images and to send and receive instructions from and to various components on board a vehicle supporting the agricultural treatment system 400. The computing unit can also include an engine control unit 422 (ECU 422), a system user interface, system UI 428, and a communications module 426.

The ECU 422 can be configured to control, manage, and regulate various electrical components related to sensing the environment that the agricultural treatment system 400 will maneuver in, electrical components related to orienting the physical components of the agricultural treatment system 400, moving the agricultural treatment system 400, and other signals related to managing power and the activation of electrical components in the treatment system. The ECU 422 can also be configured to synchronize the activation and deactivation of certain components of the agricultural treatment system 400, such as activating and deactivating the illumination module 460, and synchronizing the illumination module 460 with one or more cameras of the camera module 450 or one or more other sensors of the sensing module 451 for sensing an agricultural scene for observation and treatment of agricultural objects.

The compute module 424 can include computing devices and components configured to receive and process image data from image sensors or other components. In this example, the compute module 424 can process images; compare images; identify, locate, and classify features in the images, including classification of objects such as agricultural objects, landmarks, or scenes; and identify the location, pose estimation, or both, of an object in the real world based on the calculations and determinations generated by the compute module 424 from the images and other sensor data fused with the image data. The communications module 426, as well as any telemetry modules on the computing unit, can be configured to receive and transmit data, including sensing signals, rendered images, indexed images, classifications of objects within images, data related to navigation and location, videos, and agricultural data including crop yield estimation, crop health, cluster count, amount of pollination required, crop status, size, color, density, etc., processed either on a computer or computing device on board the vehicle, such as one or more computing devices or components of the compute module 424, or remotely, from a remote device close to the device on board the vehicle or at a distance farther away from the agricultural scene or environment in which the agricultural treatment system 400 maneuvers.

For example, the communications module 426 can communicate signals, through a network 520 such as a wired network, wireless network, Bluetooth network, wireless network under 5G wireless standards technology, radio, cellular, etc., to edge and cloud computing devices including a mobile device 540, a device for remote computing of data including remote computing 530, databases storing image and other sensor data of crops such as crop plot repository 570, or other databases storing information related to agricultural objects, scenes, environments, images and videos related to agricultural objects and terrain, training data for machine learning algorithms, raw data captured by image capture devices or other sensing devices, and processed data such as a repository of indexed images of agricultural objects. In this example, the mobile device 540 can control the agricultural treatment system 400 through the communications module 426 as well as receive sensing signals from the telemetry module 366. The mobile device 540 can also process images and store the processed images in the databases 560 or crop plot repository 570, or back onto the on-board computing system of the agricultural treatment system 400. In one example, the remote computing 530 component can be one or more computing devices dedicated to processing images and sensing signals and storing them, transferring the processed information to the database 560, or sending it back to the on-board computing device of the agricultural treatment system 400 through the network 520.

In one example, the agricultural treatment system 400 includes a navigation unit 430 with sensors 432. The navigation unit 430 can be configured to identify a pose and location of the agricultural treatment system 400, including determining the planned direction and speed of motion of the agricultural treatment system 400 in real time. The navigation unit 430 can receive sensing signals from the sensors 432. In this example, the sensing signals can include images received from cameras or LiDARs. The images received can be used to generate a grid map in 2D or 3D based on simultaneous localization and mapping (SLAM), including geometric SLAM and spatial SLAM techniques, visual odometry, or both, of the terrain, ground scene, agricultural environment such as a farm, etc. The sensing signals from the sensors 432 can also include depth signals from depth sensing cameras, including RGB-D cameras or infrared cameras, or depth calculated with stereo vision mounted sensors such as stereo vision cameras, as well as other signals from radar, radio, and sonar signals, photoelectric and photooptic signals, and location sensing signals from a global positioning system (GPS) unit, encoders for wheel odometry, IMUs, speedometers, etc. A compute module 434 of the navigation unit 430, having computing components such as a system on chip or other computing device, or the compute module 424 of the compute unit 420, or both, can fuse the sensing signals received by the sensors 432 and determine a plan of motion, such as to speed up, slow down, move laterally, turn, change the rocker orientation and suspension, move, stop, or a combination thereof, or other location, pose, and orientation-based calculations and applications to align a treatment unit 470 with the ground, particularly with an object of interest such as a target plant on the ground. In one example, the navigation unit 430 can also receive the sensing signals and navigate the agricultural treatment system 400 autonomously. For example, an autonomous drive system 440 can include motion components, including a drive unit 444 having motors, steering components, and other components for driving a vehicle, as well as motion controls 442 for receiving instructions from the compute module 424 or compute module 434, or both, to control the drive unit and move the vehicle autonomously from one location and orientation to a desired location and orientation.

In one example, the navigation unit 430 can include a communications module 436 to send and receive signals from other components of the agricultural treatment system 400, such as the compute unit 420, or to send and receive signals from other computing devices and databases off the vehicle, including remote computing devices over the network 520.

In another example, the navigation unit 430 can receive sensing signals from a plurality of sensors, including one or more cameras, LiDAR, GPS, IMUs, VO cameras, SLAM sensing devices such as cameras and LiDAR, lasers, rangefinders, sonar, etc., and other sensors for detecting and identifying a scene, localizing the agricultural treatment system 400 and treatment unit 470 in the scene, and calculating and determining a distance between the treatment unit 470 and a real-world agricultural object based on the signals received, fused, and processed by the navigation unit 430, or sent by the navigation unit 430 to be processed by the compute module 424 and/or another on-board computing device of the agricultural treatment system 400. The images received can be used to generate a map in 2D or 3D based on SLAM, visual odometry including geometry-based or learning-based visual odometry, or both, of the terrain, ground scene, agricultural environment such as a farm, etc. The sensing signals can also include depth signals, from depth sensing cameras including RGB-D cameras or infrared cameras, radar, radio, and sonar signals, photoelectric and photooptic signals, as well as location sensing signals from GPS, encoders for wheel odometry, IMUs, speedometers, and other sensors for determining localization, mapping, and the position of the agricultural treatment system 400 relative to objects of interest in the local environment as well as to the regional agricultural environment, such as a farm or other cultivated land that has a designated boundary, the world environment, or a combination thereof. The navigation unit 430 can fuse the sensing signals received by the sensors and determine a plan of motion, such as to speed up, slow down, move laterally, turn, move, stop, change roll, pitch, and/or yaw orientation, or a combination thereof, or other location, localization, pose, and orientation-based calculations and applications.

In one example, the navigation unit 430 can include a topography module configured to utilize sensors, computer components, and circuitry configured to detect uneven surfaces on a plane or scene of the terrain, which allows the topography module to communicate with the rest of the components of the treatment system to anticipate, adjust for, avoid, compensate for, and otherwise allow the agricultural treatment system 400 to be aware of uneven surfaces detected on the terrain, as well as to identify and map unique uneven surfaces on the terrain to localize the vehicle supporting the navigation unit 430.

In one example, the agricultural treatment system 400 includes a camera module 450 having one or more cameras, a sensing module 451 having other sensing devices, or both, for receiving image data or other sensing data of a ground, terrain, orchard, crops, trees, plants, or a combination thereof, for identifying agricultural objects, such as flowers, fruits, fruitlets, buds, branches, plant petals and leaves, plant pistils and stigma, plant roots, or other subcomponents of a plant, and the location, position, and pose of the agricultural objects relative to a treatment unit 470, the camera module 450, or both, and their position on the ground or terrain. The cameras can be oriented to provide stereo vision, such as a pair of color or black and white cameras oriented to point at the ground. Other sensors of the sensing module 451 can be pointed at the ground or the trees of an orchard for identifying, analyzing, and localizing agricultural objects on the terrain or farm in parallel with the cameras of the camera module 450, and can include depth sensing cameras, LiDARs, radar, electro-optical sensors, lasers, etc.

In one example, the agricultural treatment system 400 can include a treatment unit 470 with a treatment head 472. In this example, the treatment unit 470 can be configured to receive instructions to point and shine a laser, through the treatment head 472, to treat a target position and location on the ground terrain relative to the treatment unit 470.

The agricultural treatment system 400 can also include motion controls 442, including one or more computing devices, components, circuitry, and controllers configured to control the mechatronics and electronic components of a vehicle supporting the agricultural treatment system 400, configured to move and maneuver the agricultural treatment system 400 through a terrain or orchard having crops and other plants of interest, such that, as the agricultural treatment system 400 maneuvers through the terrain, the cameras of the camera module 450 are scanning the terrain and capturing images, and the treatment unit is treating unwanted plants identified in the images captured from the camera module 450 and other sensors from sensing module 451. In one example, an unwanted plant can be a weed that is undesirable to grow next to or near a desirable plant such as a target crop or crop of interest. In one example, an unwanted plant can be a crop that is intentionally targeted for removal or for blocking its growth, so that each crop growing on a specific plant or tree can be controlled and the nutrients drawn from the plant can be distributed to the remaining crops in a controlled manner.

The agricultural treatment system 400 can also include one or more batteries 490 configured to power the electronic components of the agricultural treatment system 400, including DC-to-DC converters to apply the desired power from the battery 490 to each electronic component powered directly by the battery.

In one example, the illumination module 460 can include one or more arrays of lights, such as LED lights. The one or more light arrays can be positioned near the one or more cameras or sensors of camera module 450 and sensing module 451 to provide artificial illumination for capturing bright images. The light arrays can be positioned to point radially from a side of the vehicle, parallel to the ground, to illuminate trees or other plants that grow upwards. The light arrays can also be positioned to point down at the ground to illuminate plants on the ground, such as row crops, other plants, or the soil itself. The light arrays can be controlled by the ECU 422, as well as by a synchronization module, embedded in the ECU 422 or in a separate electronic component or module, such that the lights only flash to peak power and luminosity for the length of one frame of the camera of camera module 450, with a matched shutter speed. In one example, the lights can be configured by the ECU 422 to flash to peak power for a time length that is a multiple of the shutter speed of the camera. In one example, the lights of the light array can be synchronized to the cameras with a time offset, such that the instruction to activate the LEDs of the light array and the instruction to turn on the camera and capture images are offset by a set time, a predetermined time, or a time automatically calculated based on errors and offsets detected by the compute unit 420, so that when the LEDs actually reach peak power or the desired luminosity, which will be a moment in time after the moment the ECU sends the signal to activate the light array, the camera will also activate at the same time and capture its first image, after which both the lights and cameras will be synchronized and run at the same frequency. In one example, the length of time at the peak power of the activated light is matched and synchronized with the exposure time of each frame captured by the camera, or a multiple of the exposure time. In one example, the cameras can include cameras having different resolutions and/or frame capture rates.
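
The following is a minimal, illustrative sketch in Python of offsetting the LED trigger ahead of the camera trigger so that peak luminosity coincides with the start of each exposure; the 50-microsecond rise time, the 240 Hz frame rate, and the function names are hypothetical values chosen only for the example.

    def trigger_times(frame_period_s, led_rise_time_s, n_frames=3):
        # Fire the LED early by its rise time so it is at peak power when
        # each exposure begins; both signals then repeat at the same frequency.
        led_triggers = [k * frame_period_s for k in range(n_frames)]
        camera_triggers = [t + led_rise_time_s for t in led_triggers]
        return led_triggers, camera_triggers

    led_t, cam_t = trigger_times(frame_period_s=1 / 240, led_rise_time_s=50e-6)
    for l, c in zip(led_t, cam_t):
        print(f"LED at {l * 1e6:8.1f} us, exposure starts at {c * 1e6:8.1f} us")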

For example, the lights of the light array can flash, turning on, reaching peak power, and turning off, at a rate of 30 to 1000 Hertz (Hz). In one example, the lights can flash at 240 Hz to match one or more cameras having a rolling shutter speed, global shutter speed, or both, of 240 Hz. In one example, the lights can flash at 240 Hz to match one or more cameras having a rolling shutter speed, global shutter speed, or both, of 30 or 60 Hz. In one example, the lights can reach a peak power of 2.0 M lumen with a sustained peak power ON for 250 microseconds and a duty cycle of less than 10%. In one example, the color temperature of the light can include the full spectrum of white light, including cool, warm, neutral, cloudy, etc. In one example, the color temperature of the light can be around 5000 K to reflect and artificially imitate the color temperature of the Sun.
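
The following short Python snippet only works through the duty-cycle figures quoted above, taking the 250-microsecond pulse and the 240 Hz flash rate from the text.

    pulse_s = 250e-6        # sustained peak power per flash
    flash_rate_hz = 240     # flashes per second
    duty_cycle = pulse_s * flash_rate_hz
    print(f"duty cycle = {duty_cycle:.1%}")  # 6.0%, consistent with "less than 10%"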

In one example, the agricultural treatment system 400 can include a treatment unit 470 with a treatment head 472. In this example, the treatment unit 470 can include a turret and circuitry, electronic components, and computing devices, such as one or more microcontrollers, electronic control units, FPGAs, ASICs, systems on chip, or other computing devices, configured to receive instructions to point the treatment head 472 to treat a surface of a real-world object in proximity to the treatment unit 470. For example, the treatment unit 470 can emit a fluid projectile of a treatment chemical onto an agricultural object in the real world based on detecting the agricultural object in a captured image and determining its location in the real world relative to the treatment unit 470.

The treatment unit 470 can include a gimbal assembly, such that the treatment head 472 can be embedded in, or supported by, the gimbal assembly, effectively allowing the treatment head 472 to rotate and orient itself about one or more rotational axes. For example, the gimbal assembly can have a first gimbal axis and a second gimbal axis, the first gimbal axis allowing the gimbal to rotate about a yaw axis, and the second gimbal axis allowing the gimbal to rotate about a pitch axis. In this example, a control module of the treatment unit can control the gimbal assembly to change the rotation of the gimbal assembly about its first gimbal axis, second gimbal axis, or both. The compute module 424 can determine a location on the ground scene, terrain, tree in an orchard, or other agricultural environment, and instruct the control module of the treatment unit 470 to rotate and orient the gimbal assembly of the treatment unit 470. In one example, the compute module 424 can determine a position and orientation for the gimbal assembly to position and orient the treatment head 472 in real time, and can make adjustments in the position and orientation of the treatment head 472 as the agricultural treatment system 400 moves relative to any target plants or agricultural objects of interest on the ground, whether those objects are in a fixed position on the ground or are also moving. The agricultural treatment system 400 can lock the treatment unit 470, at the treatment head 472, onto the target plant or other agricultural object of interest through instructions received and controls performed by the control module of the treatment unit 470, to adjust the gimbal assembly to move, or to keep and adjust in real time, the line of sight of the treatment head 472 onto the target plant.
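
The following is a minimal, illustrative sketch in Python of computing yaw and pitch angles that point a treatment head at a 3D target; the frame convention (x forward, y left, z up, target expressed in the treatment unit's own frame) is an assumption for the example and not taken from the description above.

    import math

    def aim_angles(target_xyz):
        # Yaw rotates about the vertical axis toward the target; pitch tilts
        # the head up or down toward it.
        x, y, z = target_xyz
        yaw = math.degrees(math.atan2(y, x))
        pitch = math.degrees(math.atan2(z, math.hypot(x, y)))
        return yaw, pitch

    # Example: a target 2 m ahead, 0.5 m to the left, 0.3 m below the head
    yaw_deg, pitch_deg = aim_angles((2.0, 0.5, -0.3))
    print(f"yaw {yaw_deg:.1f} deg, pitch {pitch_deg:.1f} deg")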

In one example, a chemical selection module, or chemical selection 480, of the agricultural treatment system 400 can be coupled to the compute module 424 and the treatment unit 470. The chemical selection module can be configured to receive instructions to send a chemical fluid or gas to the treatment unit 470 for treating a target plant or other object. In this example, the chemical selection module can include one or more chemical tanks 482, one or more chemical regulators 484 operably connected to the one or more chemical tanks 482 such that there is one chemical regulator per tank, a pump for each tank, and a chemical mixer 488 which can mix, in real time, chemical mixtures received from each chemical tank selected by the chemical mixer 488. In one example, a vehicle supporting the agricultural treatment system 400, including the chemical selection module 480, can support one chemical tank 482, a chemical pump, a chemical regulator 486, and a chemical accumulator, in series, connecting a pathway for a desired chemical or liquid to travel from a stored state in a tank to the treatment unit 470 for deposition on a surface of an object. The chemical regulator 484 can be used to regulate the flow and pressure of the fluid as it travels from the pump to the treatment unit. The regulator 484 can be manually set by a user who physically configures the regulator on the vehicle, or controlled by the compute unit 420 at the compute module 424 or ECU 422. The chemical regulator 484 can also automatically adjust the flow and pressure of the fluid from the pump to the treatment unit 470 depending on the treatment parameters set, calculated, desired, or a combination thereof. In one example, the pump can be set to move fluid from the storage tank to the next module or component in the series of components from the chemical tank 482 to the treatment unit 470. The pump can be set at a constant pressure that is always pressurized when the vehicle and agricultural treatment system 400 are running a trial for plant or soil treatment. The pressure can then be regulated and controlled, from the constant pump pressure, at the regulator, and also by an accumulator 487, so that a computer does not need to change the pump pressure in real time. Utilizing a regulator and accumulator allows the pressure needed for the spray or emission of a fluid projectile to be precisely controlled, rather than controlling the voltage or power of the pump. In one example, the agricultural treatment system 400 will identify a target plant to spray in the real world based on image analysis of the target plant identified in an image captured in real time. The compute unit 420 can calculate a direction, orientation, and pressurization for the treatment unit 470 such that when the treatment unit 470 activates and opens a valve for the pressurized liquid to pass from the chemical selection module 480 to the treatment unit 470, a fluid projectile of the desired direction, orientation, and magnitude, from the pressure, will be emitted from the treatment unit 470 at the treatment head 472.
The pump will keep the liquid stream from the chemical tank 482 to the treatment unit 470 at a constant pressure, whether or not there is flow. The chemical regulator 484 in the series of components will adjust and step down the pressure to a desired pressure, controlled manually before a trial, controlled by the compute unit 420 before the trial, or controlled and changed in real time during a trial by the compute unit 420, either from remote commands from a user or automatically calculated by the compute module 424. The accumulator 487 will keep the liquid stream in series pressurized to the desired pressure adjusted and controlled by the chemical regulator 484, even after the treatment unit 470 releases and emits pressurized fluid, so that the stream of fluid from the pump to the treatment unit 470 is always kept at a desired pressure without pressure drops from the release of pressurized fluid.

In one example, the chemical can be a solution of different chemical mixtures for treating a plant or soil. The chemicals can be mixed, or premixed, configured, and used as pesticides, herbicides, fungicides, insecticides, adjuvants, growth enhancers, agents, artificial pollination, pheromones, etc., or a combination thereof. In one example, water or vapor can be substituted for any of the fluid or chemical selections described above. In one example, the agricultural treatment system 400 can apply powder sprays or projectiles as well as foams, gels, coatings, or other physical substances that can be emitted from a chemical spray device.

In one example, the treatment unit 470 can emit a projectile, liquid, gas, aerosol, spray, mist, fog, or other type of fluid-droplet spray to treat a plurality of different plants in real time. An agricultural scene can include a row crop farm or orchard planted with different crops. In this example, each row of plants can include a different type of plant to be cultivated and treated, such that the treatment unit 470 can treat one row with one type of treatment, such as a chemical mixture-1, mixed and sent to the treatment unit 470 by the chemical mixer 488, and another row with another type of treatment for a different crop or plant, such as a chemical mixture-2. This can be done in one trial run by a vehicle supporting the chemicals and the treatment system with treatment unit 470. In another example, each row itself, in a row crop farm or orchard, can have a plurality of different types of crops. For example, a first row can include a first plant and a second plant, such that the first plant and second plant are planted in an alternating pattern of a first plant, a second plant, a first plant, a second plant, and so forth for the entire first row. In this example, the chemical mixer 488 and treatment unit 470 can deposit a first chemical mixture projectile, for precision treatment, onto the first plant, and deposit a second chemical mixture projectile, for precision treatment, onto the second plant, in real time, and then deposit the first chemical projectile again onto the third plant in the row of crops, the third plant being of the same plant type as the first plant, and so forth. In one example, a plurality of more than two types or species of plants can be planted in tilled soil and be grown and treated in a row crop with the agricultural treatment system 400.
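
The following is a minimal, illustrative sketch in Python of selecting a chemical mixture per detected plant type along an alternating row; the plant labels and mixture names are hypothetical placeholders rather than names used in this disclosure.

    # Hypothetical mapping from detected plant type to the mixture it receives.
    MIXTURE_BY_PLANT = {
        "plant_type_1": "chemical_mixture_1",
        "plant_type_2": "chemical_mixture_2",
    }

    def select_mixture(detected_plant_type, default=None):
        # Unknown plants receive no treatment unless a default is given.
        return MIXTURE_BY_PLANT.get(detected_plant_type, default)

    # Alternating row: plant 1, plant 2, plant 1, ...
    for plant in ["plant_type_1", "plant_type_2", "plant_type_1"]:
        print(plant, "->", select_mixture(plant))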

In one example, the treatment unit of the agricultural treatment system 400 can blast water, air, or a water vapor at one or more agricultural objects to wash off any undesired objects detected on the surface or another portion of the agricultural objects. The undesired objects can be unwanted bugs or debris on the agricultural object, as well as previously applied chemicals that are no longer desired to remain on the agricultural object. In one example, the treatment unit can then recoat an agricultural object that was previously cleaned with water or air with a new chemical treatment. In one example, one of the chemical tanks can also include water for the purpose of purging the stream of liquid from the tanks to the treatment units of any excess chemical or substance buildup, which could affect chemical composition, pressure, spray health, and other controlled factors that could affect the desired performance. In one example, one of the tanks can include water for chemigation, where water is mixed with a substance from a different tank.

FIG. 5 is a diagram 600 illustrating an example vehicle 610 supporting an example observation and treatment system, or treatment system 612, performing in a geographic boundary, according to some examples. In this example, the vehicle 610 can support one or more modular treatment systems 612. The treatment systems 612 can be similar to the agricultural observation and treatment systems described above. For example, a system can include onboard and offline components performing tasks both in real time, while a vehicle supporting the onboard components is performing observations and actions, and at an edge compute device or remotely, either in real time or offline.

For example, the treatment system 612 can be one of a plurality of modular component treatment systems. Each component treatment system can include one or more sensors including image capture sensors, illumination devices, one or more treatment units, for example a pair of treatment units each with a treatment head capable of aiming at a target 660 with at least 2 degrees of rotational freedom, a compute unit configured to send and receive instructions from the sensors, encoders, and actuators connected to and associated with the component treatment system, with the compute unit time-syncing all of the components, and other electronics to sync and communicate with the compute units of other component treatment systems. Each of these treatment systems 612 can receive treatment fluids from a common pressurized source of fluid, or each treatment unit can be connected to a different source of fluid. The component treatment systems are configured to sense targets 660 in real time while supported by the moving vehicle 610, determine what kind of treatment, or other action, to perform on a surface of the target 660, target and track the target 660, predict performance metrics of the instructed parameters of the action including projectile location, perform the action, including emitting a fluid projectile or light source, and evaluate the efficacy and accuracy of the action.

Additionally, each of these treatment systems 612, or component treatment systems, can communicate with and receive information from a component navigation system or navigation unit, which is configured to sense a global scene, such that each of the compute units of each of the component treatment systems can sense its local environment from sensors operably connected to the compute unit of the component treatment system, or embedded in the component treatment system, as well as receive information about the global environment in a geographic scene from the sensors and analysis performed by the sensors and the one or more compute units of the navigation unit connected to each of the local component treatment systems.

The vehicle 610 can be operating in a geographic region such as a farm or orchard. A portion of the geographic boundary, illustrated in FIG. 5, with one or more trees 634 is shown. In this example the vehicle 610 can be operating in an orchard with multiple rows of trees, each having a trunk 636, or other plants for the treatment systems 612 to observe and treat. In one example, the vehicle can be travelling in a straight line along a row of trees, with crops on both sides of the vehicle.

One or more treatment systems 612 can be mounted on top of, embedded in, suspended underneath, towed by, or otherwise securely oriented on the vehicle such that the treatment system 612 can be configured and oriented to scan a row of crops or plants, or other agricultural scenes, in a line while the vehicle 610 is moving.

The vehicle 610 may include functionalities and/or structures of any motorized vehicle, including those powered by electric motors or internal combustion engines. For example, vehicle 610 may include functionalities and/or structures of a truck, such as a pick-up truck (or any other truck), an all-terrain vehicle ("ATV"), a utility task vehicle ("UTV"), or any multipurpose off-highway vehicle, including any agricultural vehicle, such as tractors or the like. The treatment systems 612 may alternatively be powered or pulled separately by a vehicle, which may navigate paths autonomously in the geographic boundary.

In one example, a geographic boundary can be configured to have two rows of plants, one on each side of a single lane for a vehicle to navigate through. On each side of the vehicle will be vertically growing plants such as trees. The treatment system 612 can be mounted on the vehicle such that image sensors of the treatment system 612 point directly at the trees on both the left and right sides of the vehicle. As the vehicle operates along a lane or path in the orchard, the treatment system 612 can capture a series of images of the row of plants from one end to the other, as well as treat each agricultural object with a precision treatment.

FIG. 6 illustrates a diagram 700 depicting an agricultural observation and treatment system supported by a vehicle navigating in an agricultural environment. The agricultural environment can be a farm or orchard typically having one or more plants such as trees 303. While the illustration depicts a system in an environment similar to that of an orchard, the description below can apply to a system, or the same system, performing in multiple different types of environments, such as row crop farms where portions of the system include sensors pointing at the ground to detect row crop objects of interest.

The agricultural observation and treatment system can be a modular system similar to that of agricultural observation and treatment system 311 supported by vehicle 310. The system 311 can also include various sensors such as an image-based sensor 313, LiDAR-based sensors 314, or a plurality of non-vision-based sensors as described previously. Similar to the navigation units and navigation modules described in this disclosure, the treatment system depicted in diagram 700 can use sensors, such as sensors 313 and 314, to perform global registry of the agricultural environment as well as perform localization and pose estimation of the vehicle, or portions of the vehicle, in a global scene or from a global point of reference. This can include receiving sensor data and generating and building, in real time, high-definition 2-dimensional and 3-dimensional maps of the agricultural environment, or global maps, as opposed to views of individual crops from sensors close to the individual crops, which can be referred to as a local scene or local registry of a geographic boundary or scene.

In one example, the agricultural observation and treatment system 311, described in previous discussions, can be configured to perform scene understanding, mapping, and navigation-related analysis, including localization of the vehicle and/or components of the treatment system and pose estimation of the vehicle and/or components of the treatment system, for example pose estimation of individual image capture devices embedded in each component spray system (each component spray system illustrated in diagrams 900, 902, 2406, and 2408) or each modular subsystem of the agricultural observation and treatment system. The sensing can be performed with a suite of image-based sensors, motion-based sensors, navigation-based sensors, encoder sensors, other sensors, or a combination thereof, fused and synchronized together such that at least components of the agricultural observation and treatment system 311 can determine navigational properties of an environment the system is in, including pose and pose estimation of the components in real time as the vehicle, the treatment system, and components of the system move and navigate in such environment.

In one example, the agricultural observation and treatment system 311 can perform mapping of a scene and localizing the treatment system in the scene. This can include mapping a scene with a global frame of reference or point of origin in a given coordinate system, and determining the system's location relative to a point in the mapped scene, particularly a point of origin in the scene. For example, when a vehicle is navigating in an agricultural environment as in FIG. 6, the system can arbitrarily determine a point of origin in the agricultural environment, for example in the portion of the agricultural environment the system has sensed thus far, or a point of origin preloaded into the system before sensing the environment in real time, such as a first corner or first edge of a portion or region of a farm or orchard. As the vehicle navigates in the environment, the system can determine where the vehicle, and each of the components of the system, due to the components being fixed relative to the vehicle, has moved across time. The treatment system's navigation-type sensors can include GPS, IMUs, encoders, image capture devices configured to capture a high-resolution or low-resolution view of a global scene and to perform computer vision and machine learning techniques and functions such as visual SLAM and visual odometry (the global scene referring to the farm, orchard, or any kind of agricultural environment itself, and not necessarily those sensors pointing directly at individual plants or crops of plants for high-definition images of individual plants), Lidars to sense a global scene similar to that of the image capture devices, optoelectrical sensors, ultrasonic sensors, radar, sonar, and other image capture devices such as NIR cameras, RGB-D cameras, multispectral cameras, etc., configured to obtain global registry of an environment, including mapping the environment at a global level. These sensors can be used to generate, and continuously generate, a global pose estimation of the vehicle as it moves along a path, and of each of its components, relative to a point of origin in the geographic environment. The system can also determine the same global pose for components of the system, due to the components being rigidly connected to or supported by the vehicle. For example, while a camera sensor that is 15 feet from the ground, or 15 feet vertically above a bed of the vehicle, may wobble and move more than a camera sensor that is 1 foot from the ground or 1 foot vertically above the bed of the vehicle, and the poses of the cameras may differ from each other at a local level relative to the vehicle, the global pose of each camera relative to a point of origin in the geographic scene can nevertheless be estimated from the global pose estimation generated for the vehicle by a navigation module and the sensors associated with navigation. This is because each of these cameras can be rigidly connected to the sensors configured to perform global registry of the environment, such that with physical translation and movement of the vehicle, and particularly movement of the navigation-based sensors (for example, GPS and navigation-based camera sensors), the local sensors embedded in or supported by each component spray module or component treatment module will substantially move the same amount.
Additionally, each sensor supported by each component treatment module can track local objects, shapes, patterns, or any salient points or patterns local to each of the component treatment modules, such that a more accurate pose estimation for each of the component treatment modules, more specifically a pose estimation of the sensors of each component treatment module, can be generated and continuously generated as the component treatment modules are scanning a local scene to observe objects and perform actions on target objects.

In one example, the system, both the navigation system with its components, sensors, and compute units, as well as each component subsystem or component treatment module having its own components, sensors, treatment units, and compute units, can use techniques associated with simultaneous localization and mapping (SLAM) and odometry, particularly visual SLAM (VSLAM) and visual odometry (VIO), in conjunction with other non-visual-based navigation and localization analysis, fused together in real time with sensor fusion and synchronization, to perform pose estimation of the vehicle. Additionally, each modular subsystem of the treatment system, including each modular spray subsystem, for example each modular spray subsystem or component treatment module including a structural mechanism, a compute unit, one or more sensors, one or more treatment units, and one or more illumination devices, can perform VSLAM and receive other non-visual-based sensor readings, and continuously generate its own localized pose estimation, the pose being relative to specific objects detected by each of the component treatment modules, which can include agricultural objects including target objects, or nearby objects, patterns, shapes, points, or a combination thereof that are of a similar size to that of the target objects. The pose estimation of components of each of the component treatment modules will be relative to the location of the objects and patterns detected, tracked across time and across sensors in stereo for stereo matching of points for depth perception. Additionally, the system can perform projection and reprojection, and determine reprojection error, for more accurately determining the location of objects and eliminating outliers. Thus, detecting objects and patterns that are known to be fixed in space, for example a ground terrain with unique rocks or dirt patterns, or individual plants, and calculating and identifying the objects' or patterns' 3D location and/or orientation relative to the 3D location and/or orientation of the sensors sensing the objects and patterns, allows the system to understand navigation, localization, and more specifically local pose estimation of each of those sensors relative to the objects detected. Additionally, since the treatment units, and their treatment heads, are in close proximity to the individual component treatment module, and rigidly attached and connected to a structure of the component treatment module (illustrated in at least FIGS. 9A, 9B, and 21), and also in close proximity and rigidly connected to the sensors associated with that particular component treatment module and compute unit, the location and orientation of the treatment head of the treatment units (the treatment heads having encoders to determine line of sight relative to the body of the treatment unit) can also be continuously generated and determined relative to the target objects, or objects near the target objects themselves, for better accuracy of treatment.

In this disclosure, while the determined pose estimation can refer to the pose estimation generated for the vehicle or for a component modular spray subsystem, a pose estimation can be determined, using VSLAM, VIO, and/or other sensor analysis, to generate a pose, including a location and/or orientation, for any component of the vehicle or component of the agricultural observation and treatment system. In one example, a pose estimation can be referred to and generated with coordinates, for example (x₁, y₁, z₁, Φ₁, θ₁, Ψ₁), with x, y, z being the translational location relative to an origin (x₀, y₀, z₀) and Φ, θ, Ψ being the orientation relative to a starting orientation (Φ₀, θ₀, Ψ₀) of the component, for any component or portion of a component relative to an origin point and/or orientation.
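A minimal sketch of one way such a pose could be represented and composed is shown below, assuming a (x, y, z, Φ, θ, Ψ) tuple and a ZYX Euler convention; the function names, values, and convention are assumptions for illustration, not the disclosure's implementation.

```python
# Representing a pose (x, y, z, phi, theta, psi) relative to an origin, and
# composing a component's fixed offset on the vehicle with the vehicle's
# global pose to obtain the component's global pose.
import numpy as np

def euler_to_matrix(phi, theta, psi):
    """Rotation matrix from roll (phi), pitch (theta), yaw (psi), ZYX order."""
    cx, sx = np.cos(phi), np.sin(phi)
    cy, sy = np.cos(theta), np.sin(theta)
    cz, sz = np.cos(psi), np.sin(psi)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx

def compose(global_pose, local_offset):
    """Global pose of a component given the vehicle's global pose and the
    component's fixed (x, y, z, phi, theta, psi) offset in the vehicle frame."""
    gx, gy, gz, gphi, gtheta, gpsi = global_pose
    lx, ly, lz, lphi, ltheta, lpsi = local_offset
    R = euler_to_matrix(gphi, gtheta, gpsi)
    tx, ty, tz = np.array([gx, gy, gz]) + R @ np.array([lx, ly, lz])
    # Orientation composition shown at the matrix level for brevity.
    R_total = R @ euler_to_matrix(lphi, ltheta, lpsi)
    return (tx, ty, tz), R_total

vehicle_pose = (12.0, 3.5, 0.0, 0.0, 0.0, np.radians(90))    # x1..psi1 vs. origin
camera_offset = (0.5, -1.2, 1.8, 0.0, np.radians(-10), 0.0)  # fixed on the frame
print(compose(vehicle_pose, camera_offset))
```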

In one example, as illustrated in FIG. 6, a vehicle's navigation module can include sensors such as image-based sensor 313, lidar-based sensors 314, or a plurality of non-vision-based sensors as described previously, fused together to obtain global registry of the scene, for mapping the scene as well as for navigating in the scene in real time. For example, a pair, or more than one pair, of image sensors 313, in stereo, can be mounted on the vehicle such that the sensors are pointed out to the real world to observe a global scene, as opposed to down at the ground to observe individual plants, or directly in front of a tree (a few meters or fewer away) to observe individual plants, crops, or other agricultural objects. Additionally, Lidars, radar, sonar, and other sensors can be mounted in a similar location as that of the cameras to register a global scene.

The agricultural observation and treatment system, at the navigation unit, or a component of the navigation unit, can perform real-time VSLAM and VIO to simultaneously map the agricultural environment that it is in, as well as understand the location of, that is localize, the agricultural observation and treatment system itself as it is navigating in the environment, whether it is navigating autonomously or with a human driver on the vehicle or remotely. In one example, as illustrated in diagram 700 of FIG. 6, VSLAM can be performed using keypoint detection and matching across stereo and across time, or surface matching of salient points or regions of surfaces detected. Keypoints, for example keypoint 706, can be generated in real time from image frames captured by image sensors. In one example, stereo image sensors can capture frames at the same time. Keypoints can be generated and identified by a compute unit embedded in the image sensor, or a compute unit of the navigation unit operably connected to each of the stereo image sensors and configured to receive images or imagery data from the sensors. Common keypoints are matched such that the system can understand the depth of the keypoint from the stereo sensors, and thus the depth of the keypoint from the navigation unit and therefore the vehicle. Thousands of keypoints can be detected, generated, and tracked over time per frame. The keypoints themselves can be clusters of pixel points representing a corner, blob, line, or other salient pattern. These points do not necessarily have to be real-world identifiable known objects. For example, a keypoint can be generated where two objects meet, or where one object and a background meet, for example an edge of a leaf against the background sky. The difference in color between a leaf and the sky will create a line or edge between the two colors captured by an image sensor. Other examples can be corners of objects, dots, or lines. For example, three small rocks next to each other can form a salient pattern to track, even if the system doesn't understand that it is a 3-rock cluster, meaning the system cannot extract enough features to perform object detection and determine that one or more rocks were identified. But the actual 3 rocks form a rigid and complex pattern that the system can still identify and track.

The system may not be able to detect an identity of an object, but it can detect its contours and edges and track those points. In this example, any type of points or pixels, clusters of points or pixels, lines, or corners that the system may determine as salient points or patterns can be generated as a keypoint. The keypoint may or may not be generated in a different frame captured by a different image sensor, for example a second sensor in stereo with a first sensor. Common keypoints between different frames can be matched using various computer vision techniques such as image correspondence, keypoint matching, template matching where the templates are patches of an image including keypoints, dense SLAM techniques, techniques with classifiers, RANSAC and outlier rejection to more accurately detect common keypoints, other statistical modelling techniques, or other corner, line, blob, and edge detection techniques including FAST, SIFT, Harris corner detection, Lucas-Kanade tracking, SURF, NERF, ORB, or a combination thereof. Additionally, lines, corners, patterns, or other shapes, whether referred to as keypoints or not, that are generated in each frame can be matched to similar lines, corners, patterns, and shapes using the same or similar techniques described above.
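A minimal sketch of one of the techniques named above (ORB keypoints, brute-force matching, and RANSAC outlier rejection), assuming OpenCV is available; the file names are hypothetical and this is not the disclosure's implementation.

```python
# Detect keypoints in two frames (e.g. a left/right stereo pair or consecutive
# frames in time), match them, and reject outliers with RANSAC.
import cv2
import numpy as np

def match_keypoints(frame_a, frame_b, max_matches=500):
    orb = cv2.ORB_create(nfeatures=2000)            # corner/blob style keypoints
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    matches = matches[:max_matches]
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])
    # RANSAC on the fundamental matrix rejects matches inconsistent with a
    # single rigid camera geometry (outliers such as wind-blown leaves).
    _, inlier_mask = cv2.findFundamentalMat(pts_a, pts_b, cv2.FM_RANSAC, 1.0, 0.99)
    inliers = inlier_mask.ravel().astype(bool)
    return pts_a[inliers], pts_b[inliers]

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # hypothetical file names
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
pts_l, pts_r = match_keypoints(left, right)
print(f"{len(pts_l)} inlier correspondences")
```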

The keypoints that are matched between two cameras in stereo, or between more cameras, can be used to determine the depth of the object, point, or points in real-world space associated with the keypoints detected. These techniques can be applied similarly with Lidar, or used in conjunction with lidar-sensed point clouds and fused together for more accurate representations of a scene. The images sensed, and the keypoints matched between cameras, can be used to perform dense reconstructions of environments, as well as to perform real-time navigation, localization, and pose estimation of the sensors sensing the environment.
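As a minimal sketch of recovering depth from keypoints matched between a calibrated, rectified stereo pair, the standard relation depth = focal_length × baseline / disparity can be applied; the focal length and baseline below are placeholders, not parameters from the disclosure.

```python
import numpy as np

def stereo_depth(pts_left, pts_right, focal_px=1400.0, baseline_m=0.12):
    """pts_left/pts_right: (N, 2) pixel coordinates of matched keypoints."""
    disparity = pts_left[:, 0] - pts_right[:, 0]     # horizontal shift in pixels
    disparity = np.clip(disparity, 1e-6, None)        # avoid division by zero
    depth = focal_px * baseline_m / disparity          # metres from the cameras
    return depth

pts_l = np.array([[640.0, 360.0], [820.0, 410.0]])
pts_r = np.array([[598.0, 360.0], [801.0, 410.0]])
print(stereo_depth(pts_l, pts_r))   # e.g. a tree-trunk keypoint a few metres away
```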

In one example, dense visual SLAM can be performed, whether by a single image sensor, such as a camera, infrared camera, RGB-D camera, multispectral sensor, or other sensor, or by stereo cameras, or by two or more different types of cameras that are synchronized with a known fixed distance and orientation relative to each other, to compare images from frame to frame across time. The commonly matched and tracked points can allow the system to determine the amount of movement, translation, and orientation change of the sensor based on the shape and depth of the point or object detected and tracked from a previous frame. For example, an agricultural observation and treatment system can sense a point in space, which can be any type of pattern, but for example can be a pattern at the base of a tree trunk. The system can designate that as a keypoint 706. As the vehicle supporting the agricultural observation and treatment system moves closer to the base of the tree trunk, the pattern of the keypoint 706 from a first frame will change in shape, location, size, etc., in a subsequent frame captured at a later point, as the image sensor has moved closer to the point. The system would be able to calculate the amount of movement in space as it tracks the same point in space with keypoint matching or other techniques to perform VSLAM.
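A minimal sketch of one way the frame-to-frame motion described above could be estimated from tracked keypoints, assuming OpenCV and a known camera intrinsic matrix K (monocular visual odometry; translation is recovered only up to scale); this is an illustrative stand-in, not the disclosure's method.

```python
import cv2
import numpy as np

def frame_to_frame_motion(pts_prev, pts_curr, K):
    """pts_prev/pts_curr: (N, 2) pixel positions of the same keypoints in two frames."""
    E, mask = cv2.findEssentialMat(pts_prev, pts_curr, K,
                                   method=cv2.RANSAC, prob=0.999, threshold=1.0)
    # Decompose the essential matrix into rotation R and unit-length translation t.
    _, R, t, _ = cv2.recoverPose(E, pts_prev, pts_curr, K, mask=mask)
    return R, t

K = np.array([[1400.0, 0, 960.0],
              [0, 1400.0, 540.0],
              [0, 0, 1.0]])           # placeholder intrinsics
# pts_prev and pts_curr would come from keypoint matching or tracking,
# for example the ORB sketch shown earlier.
```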

In one example, VSLAM can be performed by detecting objects themselves, rather than arbitrary points and patterns detected in an image frame. For example, whether referred to as keypoints or not, the system can ingest a frame, perform feature extraction and object detection, and detect specific known objects with a machine learning model. These objects, rather than salient points such as corners, lines, blobs, or other patterns that can be tracked across frames in stereo and across frames in time, can be objects such as agricultural objects and landmarks, such as leaves, weeds, rocks, terrain, trees, components of irrigation systems, wheels of a vehicle, or any other landmarks where the system can identify the landmark itself, rather than just salient points that may or may not be associated with a landmark. For example, referring to diagram 1300a of FIG. 12A, the agricultural observation and treatment system can detect a plurality of fruitlets, buds, and landmarks in a single frame using a machine learning detector embedded on board the system. From frame to frame, as the treatment system scans the orchard, the image sensors of each of the component treatment modules configured to detect individual objects and landmarks themselves can detect objects and match the object detections from frame to frame for the purposes of SLAM and pose estimation of the sensor sensing the object, in addition to determining whether to track an object to perform a treatment action. In another example, as illustrated in diagram 1600 of FIG. 14, the component treatment module sensing and analyzing the frame or image 1610 can detect every agricultural object in the frame, including carrots (agricultural objects not to treat) and weeds (agricultural objects detected for treatment). The system can identify features of each object and match the features from frame to frame for scene understanding and for determining the movement and orientation of the sensors capturing the frame (because the objects themselves are not moving in space) relative to those objects, even though the agricultural observation and treatment system under certain configurations is tasked to only target, track, and treat the weeds. The objects can be detected using feature extraction and object detection with various machine learning techniques, computer vision techniques, or a combination thereof. Additionally, while the discussion above focuses on image-based sensors, similar techniques can be applied using one or more lidar sensors for point-cloud-to-point-cloud matching.

In one example, in addition to performing functions to allow a system to determine pose and therefore navigate in an environment, the system can use the same sensor readings used to determine pose estimation to perform dense reconstruction of a scene and map the agricultural environment. This can be done with VSLAM, which takes multiple views across frames in time and in stereo to reconstruct a scene. Other techniques can include dense reconstruction of a scene from SLAM or structure from motion. Other techniques for improving scene geometry in a sequence of frames captured by an image sensor can include local bundle adjustment to improve visual SLAM.

In one example, the agricultural objects discussed in this disclosure can be any number of objects and features detected by various sensors including image sensors. The agricultural objects can include varieties of plants and different phenological stages of different varieties of plants (even though the specific object detected in a geographic scene has different physical features due to its growth, it is the same physical object in the geographic scene), as well as target plants to treat, including treating plants to turn into a crop or treating plants for plant removal or for stunting, stopping, or controlling the growth rate of a plant. Agricultural objects can include soil or patches of soil for soil treatment. Other objects detected and observed by a treatment system can include landmarks in the scene. Landmarks can be trees and portions of trees including spurs, stems, shoots, and laterals, specific portions of the terrain including dirt, soil, water, mud, etc., trellises, wires, and other farming materials used for agriculture. Additionally, landmarks can be any object that can be detected by the observation and treatment system and tracked as a vehicle supporting the treatment system is moving, as well as tracked throughout time as the vehicle visits the location across a grow season in multiple trials or runs. For example, agricultural objects described throughout this disclosure can also be considered landmarks for tracking, even though the observation and treatment system may not necessarily be targeting the object for treatment or tagging and indexing the object for observation. The landmarks can be tracked, using SLAM or other computer vision and machine learning based or assisted computer vision techniques, to better locate a nearby landmark or object that will be a target for treatment. For example, a plurality of landmarks and potential landmarks that are also target objects can be detected in a given image frame, or a pair of stereo frames, or sensed by a plurality of sensors synched in time. Once the agricultural observation and treatment system has identified these landmarks, they can be tracked to more accurately locate the nearby target objects.

In one example, landmarks that can be tracked are not specifically defined objects in the real world, but patterns or combinations of objects or features, such as a region in space separating one or more unknown objects from a background. While the system may not be able to perform enough feature extraction to detect an object's identity, it can still detect that an object, or a pattern created by one or more objects, exists in the captured sensor reading, and is one that is fixed in space and will not move. These detections can also be referred to as landmarks and used to track the landmarks for real-time pose estimation. For example, landmarks detected as illustrated in diagrams 1300a and 1300b of FIGS. 12A and 12B can be used to generate keyframes for further offline analysis, including matching frames taken at different times of the same or similar view to create a time-lapse visualization of an object by comparing only keyframes instead of every frame ingested by a sensor, and determining which candidate keyframe previously generated and stored by the system is used in real time to perform image correspondence for live/real-time detections, targeting, and tracking; such landmarks can also be used and tracked from frame to frame to perform VSLAM and generate better pose estimation for the sensor sensing the frames.

An agricultural object of interest can be a target plant for growing into a harvestable crop. In one example, the agricultural object of interest can be a target plant to remove, such as a weed or any plant that is not a crop. In one example, the agricultural object can be portions of a soil of interest to observe and cultivate, such that at least a portion of the cultivating process is treating the soil with one or more fluid chemical treatments or fertilizer.

In one specific example, the agricultural observation and treatment system can perform a variation of SLAM focusing on one or more specific features to extract, to more accurately generate pose estimation for agricultural observation and treatment systems performing in agricultural environments. For example, a system can be embedded with a SLAM algorithm, whether assisted by machine learning, other computer vision techniques, or both, to detect tree trunks. In most orchards, tree trunks do not grow in size, change shape, or move over relatively short spans of time. Tree trunks are also spaced apart enough that each tree trunk detected can allow a system to determine that different sets of agricultural objects detected belong to separate trees, which can be used to determine which sets of frames in a plurality of frames can be generated or used as a keyframe.

In another specific example, the agricultural observation and treatment system can perform a variation of SLAM focusing on one or more specific features to extract, to more accurately generate pose estimation for agricultural observation and treatment systems performing in agricultural environments. For example, a system can be embedded with a SLAM algorithm, whether assisted by machine learning, other computer vision techniques, or both, to detect beds, troughs, furrows, and vehicle tracks of a row crop farm, the beds being where plants are planted and grow, and the tracks being where vehicle wheels can travel. Instead of detecting arbitrary lines in a given frame, a machine learning assisted SLAM algorithm can be configured to detect just the beds and troughs, because the beds and troughs look substantially like lines that would take up an entire frame. This can help ease the performance load of performing VSLAM, as the vehicle only needs to operate between two lines, detected as a trough, to better minimize drift.
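A minimal sketch of detecting long, travel-aligned bed/trough edges with a Hough transform rather than arbitrary keypoints, assuming OpenCV; the thresholds, angles, and file name are illustrative assumptions, and a machine-learning detector as described above could replace this classical step.

```python
import cv2
import numpy as np

def detect_bed_lines(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    # Keep only long lines: candidate bed/trough boundaries that span most of
    # the frame as the vehicle drives along the row.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=120,
                            minLineLength=gray.shape[0] // 2, maxLineGap=40)
    bed_lines = []
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
            if 60 <= angle <= 120:       # roughly aligned with the travel direction
                bed_lines.append((x1, y1, x2, y2))
    return bed_lines

frame = cv2.imread("row_crop_frame.png")   # hypothetical frame from a module
print(len(detect_bed_lines(frame)), "candidate bed/trough lines")
```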

In one example, each of the compute units of each component treatment module can associate local and global pose. While the compute unit and sensors associated with the navigation unit perform SLAM and account for a vehicle pose (the vehicle components' own SLAM) for navigation purposes, the system can also combine these results to individually map every single plant, because the plant will have a known location due to its location relative to the component treatment modules' sensors, for example stereo image sensors. The stereo image sensors will know where they are relative to a global scene, allowing the mapping of a local scene and the real-time localization of the vehicle itself in the global scene. The combination will allow the system to generate a global map with every single unique and individual agricultural object sensed and indexed with high precision in the global map itself.

In one example, a drone with one or more image sensors, lidars, GPS, and an IMU can be configured to scan and map a scene of a geographic boundary of an agricultural scene, and the sensor readings of the globally mapped scene from the drone can be combined with the globally mapped scene from the navigation unit's sensors and compute units of the agricultural observation and treatment system and with the locally mapped individual agricultural objects and landmarks imposed onto the globally mapped scene generated by the agricultural observation and treatment system. This would allow any views or scenes not necessarily or accurately captured by the agricultural observation and treatment system on the ground, but which could be captured by the drone, to be accounted for, creating an even more accurate global map, with more views of the map or indexed database of a global map, comprising a global scene from the drone with high-definition readings of the geographic boundaries and views of the boundaries, a denser global scene of portions of the agricultural environment including trees, rows, beds, and troughs and their locations relative to each other and in the global frame, as well as a local scene of each individual agricultural object and landmark detected by each component treatment module.

In this example, individual crops, plants, agricultural objects, and landmarks can be sensed and registered with location and pose relative to the image sensors sensing the individual crops, plants, agricultural objects, and landmarks. This will allow the agricultural observation and treatment system, at each of the component treatment systems having sensors and treatment units, to identify more accurately where each object is relative to the local sensors of each component treatment module in real time. Additionally, the location of the individual crops, plants, agricultural objects, and landmarks relative to the local sensors of each component treatment module can be indexed and stored. Because the local sensors of each component treatment module are rigidly connected to each other with a support structure, and the support structure is also rigidly connected to the vehicle supporting the navigation module and its sensors for obtaining global registry and global pose, the agricultural observation and treatment system can combine the local and global pose to determine where each individual crop, plant, agricultural object, and landmark sensed is located in a global scene. Thus, the agricultural observation and treatment system can be configured to create a global map of a scene with each individual object and/or landmark detected in the global map with sub-centimeter accuracy of where each individual object and/or landmark is located in the global map, or at least to digitize and index an agricultural scene without generating a visualizable map.

In one example, a vehicle or global pose estimation can be determined, as well as an individual localized pose estimation for each component treatment module.

For example, each treatment module, having one or more image sensors, one or more illumination sources, one or more treatment units (being a spray device or a light treatment device), and a compute unit, all operable and rigidly attached to each other as a single modular treatment module, for example as shown in diagram 900 of FIG. 9A and FIG. 9B, can compute its own pose estimation. This in effect allows each treatment unit to have a more accurate pose estimation, for example a local pose estimation of the treatment unit and the image sensors of the same treatment module supporting the treatment units, locally to each other and to the ground row plants or orchard trees that each of the treatment module's sensors are sensing. Additionally, the sensors of the navigation unit or navigation module, for example the navigation unit including a GPS sensor, one or more IMUs, one or more encoders, one or more cameras (for example facing outward into the geographic area as a whole, either facing forward or backward along the direction of the vehicle path), or one or more laser or lidar sensors, can be configured to generate a global pose estimation for the vehicle itself. In this example, each treatment unit of each treatment module can rely on both the pose estimation determined by the treatment module's local sensors and compute units, for the local pose estimation of each treatment module, and the pose estimation determined by the navigation unit's sensors and compute unit, for the global pose estimation of the vehicle. The combined and fused pose estimation can be configured to give each treatment module a more accurate localization, orientation, and pose, such that as each treatment module detects, targets, and tracks each object of interest detected, the treatment unit can target and treat the agricultural objects of interest more accurately.
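As a minimal sketch of the idea of combining the two pose sources, a simple inverse-variance blend of a global (vehicle-level) position estimate with a local (per-module) visual estimate is shown below; a real system would more likely use a Kalman filter or factor graph, and the weights and numbers here are made up for illustration.

```python
import numpy as np

def fuse_pose(global_xyz, global_var, local_xyz, local_var):
    """Inverse-variance weighting of two position estimates of the same module."""
    g = np.asarray(global_xyz, dtype=float)
    l = np.asarray(local_xyz, dtype=float)
    w_g, w_l = 1.0 / global_var, 1.0 / local_var
    fused = (w_g * g + w_l * l) / (w_g + w_l)
    fused_var = 1.0 / (w_g + w_l)
    return fused, fused_var

# Vehicle GPS/IMU places the module here (coarser); local VSLAM places it here (finer):
fused, var = fuse_pose([10.00, 2.00, 1.50], 0.04, [10.06, 1.97, 1.52], 0.0025)
print(fused, var)
```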

FIG. 7A illustrates an example method 702 that may be performed by some example systems, subsystems, or components of systems described in this disclosure, either online, that is onboard a vehicle supporting one or more modular agricultural observation and treatment systems, subsystems, or components of systems, or offline, that is at one or more servers or edge compute devices.

For example, at step 710, an agricultural observation and treatment system can initialize. At step 720, the agricultural observation and treatment system can determine a real-world geospatial location of the treatment system. The determining of location can be performed by location-based sensors such as GPS; image-based sensors, such as cameras including CCD, CMOS, or Lightfield cameras, multispectral cameras, RGB-D cameras, NIR cameras, the same two cameras in stereo or variations of cameras synched in stereo, or more than two synchronized together; Lidar; laser sensors; or motion sensors such as IMU, MEMS, NEMS, or motion-based encoders. At step 730, the agricultural observation and treatment system can receive one or more images or point clouds from one or more image capture devices. At step 740, the agricultural observation and treatment system can identify one or more salient points or salient regions of at least a portion of a first frame. The salient points can be keypoints and the salient regions can be keypoint regions, clusters of pixels in a frame that are associated with a fixed object or points in space that can be compared for similarities, including keypoint matching, across stereo sensors or across time between frames captured by the same sensor. The salient points themselves can be object based, instead of keypoint based, such as objects that can be detected with a neural network using feature extraction and object detection. In one example, the salient points can be points or regions detected by a machine learning detector, keypoints generated using various computer vision algorithms or by a machine learning detector or a machine learning assisted computer vision algorithm, or a combination thereof. At step 750, the agricultural observation and treatment system can identify one or more salient points or salient regions of at least a portion of a subsequent frame. At step 760, the agricultural observation and treatment system can determine a change in position of the treatment system based on comparing the first and subsequent frames. At step 770, the agricultural observation and treatment system can verify or improve the determined change in position with the position determined by the location-based sensors, motion sensors, or both. At step 780, the agricultural observation and treatment system can determine a pose estimation. And at step 790, the agricultural observation and treatment system can send instructions to activate actuators. The points of interest to track for motion estimation and SLAM can be real-world objects or patterns detected, or salient clusters of points in an image or point cloud, that can be tracked from frame to frame or point cloud to point cloud as a vehicle with image or point cloud sensors moves in time. These can be detected by computer vision methods of detecting edges, corners, blobs, lines, etc., or by a machine learning algorithm configured to detect real-world objects, such as agricultural objects, for example leaves for sensing systems pointed down at row crops, rocks, dirt, beds, troughs, crops, weeds, etc.
For example, if a landmark such as a small rock in the dirt, or a leaf of a crop, appears in a frame captured by an image sensor, the compute unit can determine that a cluster of pixels of the frame comprises a “rock” detected by a machine learning algorithm, such that the detected rock object can be tracked and matched across the stereo vision system and across time, that is from frame to frame, and used to perform motion estimation and pose estimation by tracking the relative position of the suite of fixed sensors, and by extension the position of the vehicle, the agricultural observation and treatment system supported by the vehicle, and each of its treatment modules, to the rock as the vehicle moves in a direction relative to the rock, or to any object detected by the machine learning algorithm. Alternatively, the compute unit can detect that a cluster of pixels in the captured frame is different enough from a detected background of the frame such that the cluster of pixels, while the compute unit may not know, or may not extract enough features to determine and detect, that it is a real-world rock, is still a real-world object that is stationary and can be tracked from frame to frame, and by extension can be compared from frame to frame, to perform motion estimation and pose estimation. In one example, the objects detected in real time can be used to determine which areas detected in a geographic scene should be treated. Additionally, the objects detected in real time can also be used to determine motion estimation and pose estimation of the sensor sensing the objects themselves, and by extension the pose of the vehicle and agricultural observation and treatment system and each subsystem, for example each treatment module, on board the vehicle.
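A schematic sketch of the step 710-790 flow of FIG. 7A is given below; the collaborating objects and function names are hypothetical placeholders standing in for the sensors, vision pipeline, sensor-fusion logic, and actuators described above, not an implementation from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class PoseEstimate:
    position: tuple      # (x, y, z) relative to the chosen origin
    orientation: tuple   # (phi, theta, psi)

def run_observation_cycle(sensors, vision, fusion, actuators):
    geo = sensors.read_gps_imu()                        # step 720: geospatial fix
    prev_frame = sensors.grab_frame()                   # step 730: receive frames
    curr_frame = sensors.grab_frame()
    kp_prev = vision.salient_points(prev_frame)         # step 740: first frame
    kp_curr = vision.salient_points(curr_frame)         # step 750: subsequent frame
    delta = vision.estimate_motion(kp_prev, kp_curr)    # step 760: change in position
    delta = fusion.verify_with_sensors(delta, geo)      # step 770: cross-check GPS/IMU
    pose = PoseEstimate(*fusion.integrate(delta, geo))  # step 780: pose estimation
    actuators.send_instructions(pose)                   # step 790: activate actuators
    return pose
```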

FIG. 7B and FIG. 7C illustrate additional example methods of 702 that may be performed by some example systems, subsystems, or components of systems described in this disclosure.

Alternatively, or additionally, at step 752, the agricultural observation and treatment system can compare one or more salient points or regions of the first frame with the one or more salient points or regions of one or more subsequent frames. This can be done with one camera by comparing frame to frame; with two or more cameras by comparing left and right, or top and bottom, frames for depth, or from frame to frame from a first camera to the next or from the first camera to the second camera; or a combination thereof. This can also be done with different types of cameras, including comparing and matching salient points of a visible color image with those of an infrared image and/or an ultraviolet image, and obtaining the structure of the object or salient points, location, and pose from the comparison, not necessarily from motion from frame to frame in time from the same sensor, but from frame 1 of camera 1, frame 1 of camera 2, and frame 1 of camera 3 at the same time. At step 754, the agricultural observation and treatment system can generate one or more 3D models of objects identified in at least one of the first or subsequent frames, associated with objects in the real world. In one example, the objects can be target objects of interest, or objects related to sprays, such as capturing the spray projectile itself, splash health, and splat detection, referring to whether a spray projectile hit the target by measuring the splat size and location of the projectile creating a “splat” on the object and/or the surface of the ground near the object.

Alternatively, or additionally, at step 756, the agricultural observation and treatment system can generate one or more global maps of a real-world geographic area including objects identified in at least one of the first or subsequent frames, or both, associated with objects, landmarks, or both in the real world.

In one example, tracking can be performed to find the same feature from a first frame in subsequent frames, for example using Lucas-Kanade tracking, under the assumption that the object does not move far from frame to frame, for tracking an object from a moving vehicle. The feature detected in the first frame for tracking can be a real-world object detected and identified by one or more machine learning algorithms on board the treatment modules of the treatment system, or by real-time edge compute, performing observation and actions in real time online, or an object that can be represented by a cluster of salient points in a frame, such as corners, lines, blobs, or edges of an object, detected by computer vision techniques such as FAST, SIFT, SURF, or other techniques described in this disclosure.
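A minimal sketch of Lucas-Kanade tracking from one frame to the next, assuming OpenCV; the file names and corner-detector settings are hypothetical, and it relies on the same small-motion assumption noted above.

```python
import cv2
import numpy as np

def track_features(prev_gray, curr_gray, prev_pts):
    """prev_pts: (N, 1, 2) float32 corner locations from the previous frame."""
    curr_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    return prev_pts[ok], curr_pts[ok]

prev_gray = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # hypothetical files
curr_gray = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                   qualityLevel=0.01, minDistance=7)
old_pts, new_pts = track_features(prev_gray, curr_gray, np.float32(prev_pts))
print(f"tracked {len(new_pts)} features")
```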

FIG. 8 illustrates an example schematic block diagram of componentry that may be utilized with a system 800 similar to that of the agricultural observation and treatment systems described previously in this disclosure. The system 800 may include a sub-system 802 that communicates with one or more perches, or treatment modules 804. The treatment module 804 can be a component of a modular system of one or more treatment devices. Each treatment module 804 can include one or more image sensors 820 and 822, and one or more illumination units 824. In one example, an agricultural observation and treatment system, described in this disclosure, can be referred to as a portion of a system for observing and treating objects that is onboard a moving vehicle. Performances by the portion of the system onboard the moving vehicle, including computations and physical actions, can be considered online performance or live performance. A portion of the system comprising one or more compute or storage components that are connected as a distributed system can be considered the offline portion of the system, configured to perform remote computing, serve as a user interface, or provide storage. In one example, the agricultural observation and treatment system is a distributed system, distributed via cloud computing, fog computing, edge computing, or a combination thereof, or more than one subsystem is performing computations and actions live in addition to the portion of the system onboard a moving vehicle. In one example, treatment modules, a plurality of treatment modules, or first, second, etc., treatment modules discussed in this disclosure can be the treatment module 804 of diagram 800. The treatment module can be a subsystem having a compute unit, sensors including image capture sensors, illumination devices, and one or more treatment units comprised of one or more nozzles guided by a gimbal or turret mechanism, local to each treatment module. The totality of multiple modular treatment modules described in this disclosure, including treatment module 804 of diagram 800, can be part of a greater online or on-board agricultural observation and treatment system having a global compute unit and sensors sensing a greater geographic agricultural environment and communicating, sending, and accessing information and instructions to each modular treatment module subsystem. And the on-board agricultural observation and treatment system, supported by a vehicle and/or one or more edge compute devices to perform computing functions, can be a subsystem or a node of a greater agricultural observation and treatment system having a mesh network of onboard observation and treatment systems across a fleet of vehicles operating in multiple geographic areas, for example at multiple different farms and orchards having different crops requiring different observation and treatment services, as well as backend servers and compute and cloud compute subsystems configured to perform offline functions such as ingesting performance logs and sensor data captured at each farm, performing analysis and quality analysis, and performing training on one or more machine learning models that can be uploaded to each on-site or on-board system, and many other nodes including a user interface for a user to engage in the functionalities discussed above.

The treatment module 804 can include a compute unit 806, which can include a CPU or system on chip, that sends data and instructions to an ECU 818, or daughterboard ECU, for synchronization of the operation of one or more illumination units 824 and the operation of image sensors 820 and 822. The ECU 818 can send data to and receive data from one or more cameras of image sensors 820, and/or one or more cameras of image sensors 822, and one or more illumination units 824 each including a light bar of LEDs, including instructions by the ECU 818 to activate the image sensors 820 and 822 and illumination units 824.

The system 800 can also include a navigation unit 802 configured to interface with each treatment module 804. The navigation unit 802 can include one or more components and modules configured to receive positional, velocity, acceleration, GPS, pose, orientation, and localization and mapping data. In one example, the navigation unit 802 can include a vehicle odometry module 808 with encoders and image sensors to perform wheel odometry or visual odometry and process images and vehicle movement to calculate and determine a position and orientation of the vehicle supporting the system 800. The navigation unit can also include an IMU module 810 with one or more IMU sensors, including accelerometers, gyroscopes, magnetometers, compasses, and MEMS and NEMS sensors, to determine IMU data. The navigation unit 802 can also include a GPS module 811 to receive GPS location data, for example up to a centimeter accuracy. The navigation unit can also include a SLAM module 812 for performing a simultaneous localization and mapping algorithm and application for mapping an environment, including an agricultural geographic boundary such as a farm, orchard, or greenhouse, and determining the localization and orientation of a vehicle supporting the system 800 and of components of the system 800 relative to the geographic boundary, as well as the localization and orientation of agricultural objects and scenes detected by the system 800. The SLAM module 812 can take sensor data from one or more cameras, including stereo vision cameras, cameras that are omnidirectional, cameras that are moving relative to the vehicle, or other sensors 813 including LiDAR sensors. The LiDAR sensors can be flash LiDAR sensors, static LiDAR sensors, spinning LiDAR sensors, other rangefinders, and other sensors discussed above. As the navigation unit 802 receives sensing data related to localization and mapping, a compute unit 806, including a CPU or system on chip, of the navigation unit 802 can fuse the sensing signals and send the data to each of the treatment modules 804 or to a remote compute unit or server through a communications module 840. The sensing components of the navigation unit 802 can be activated and controlled by an ECU 814. The ECU 814 can also be configured to interface, including activation and power regulation, with each of the treatment modules 804.

The treatment module 804 can also include a treatment unit 828 configured to receive instructions from the compute unit and ECU 818, including treatment parameters and a treatment trajectory of any fluid projectile that is to be emitted from the treatment unit 828. A chemical selection unit 826 can include one or more chemical pump(s) configured to receive non-pressurized liquid from one or more chemical tanks 832 and operable with each treatment unit of each of the treatment modules 804, or with multiple treatment units 828 of each treatment module 804. The one or more chemical tanks 832 may hold different types of chemicals. The chemical pumps can send stored liquid or gas from the one or more chemical tank(s) 832 to one or more regulators 834, which will further send pressurized liquid to one or more other components in series as the pressurized liquid reaches the one or more treatment units 828 of system 800. Other components in the series of the chemical selection unit 826 can include an accumulator and chemical mixer 836 (described in previous sections of the disclosure). The treatment unit may emit the liquid at a particular trajectory in order for the fluid projectile to come into contact with an object at a particular physical location.

In one example, as a vehicle performs a trial on a geographic boundary, each of the treatment modules 804 can perform actions independently of the others. Each treatment module 804 can perform its own image acquisition and processing of images for treatment. The treatment parameters can be determined locally on each treatment module 804, including object detection and classification of agricultural objects in a scene as well as determining treatment parameters based on the objects and features detected. The processing can be performed by each compute unit 806 of each treatment module 804. Each of the treatment modules 804 can receive the same sensed, fused, and processed navigation, vehicle orientation, and position data from the navigation unit 802, since each of the treatment modules 804 will be supported by the same vehicle. In one example, each of the treatment modules 804 can share the same chemical selection component 826. In one example, multiple chemical selection units 826 can be configured to connect and interface with each treatment module 804, where one treatment module 804 can be configured with one chemical selection unit 826.

FIG. 9A illustrates an example modular treatment module 900, or perch. In one example, the modular treatment module 900 may be configured with multiple illumination units 910 mounted to a frame 902, 903 or supporting structure. The modular treatment module according to various examples may include multiple illumination units 910 of LED lights. Illumination unit 910 may include one or multiple LED lights, including an array of LED lights. The LED lights can each be packaged in an enclosure for better mounting of the LED lights to the rest of the modular treatment module. For example, a light enclosure can support 4 individual LED lights, and each LED light can include a plurality of LED diodes to illuminate light. In another example, the LED lights can be standalone, supported by a structure or heatsink and individually mounted to the rest of the treatment module. In one example, each LED light, having a plurality of LED diodes, can include one or more lenses to focus the illumination intensity, direction, or illumination area. The modular treatment module 900 may include a camera enclosure, or camera bank 904, that includes one or more cameras or other image sensing devices. In one example, the illumination units 910 and treatment units 1110, supported by treatment unit frame 903, can all be operably mounted and connected to the camera bank 904 having a camera enclosure. The inner two cameras may be identification cameras to obtain digital imagery of agricultural objects, and the outer two cameras may be cameras used to obtain imagery of agricultural objects being treated, including the treatment projectile, treatment profile, splat detection, and treatment health and accuracy. In one example, a pair of stereo cameras can be configured to ingest frames at a high frame rate and at a high exposure rate or refresh-rate shutter speed. For example, the cameras can ingest images up to 8K definition per frame at 2400 frames captured per second. In this example, the compute unit embedded and enclosed in the camera bank 904, configured to send instructions to and read inputs from each of the treatment units 1110, sensors including cameras, illumination units 910, and so forth, acting as the main compute unit of the component modular treatment module 900, can receive different downsamples of image quality and number of frames per second. For example, one or more FPGAs, ASICs, or one or more microcontrollers can be embedded at each of the camera modules, such that the camera's exposure and shutter speed yields 8K-definition frames at 2400 frames per second. The FPGA, ASICs, or one or more microcontrollers can automatically sample the images into smaller-resolution images at fewer frames per second sent to the compute unit. Additionally, they can send more than one type of image packet to the compute unit such that the compute unit receives different streams of data captured by the same pair of image sensors. For example, the 8K frame captured at 2400 frames per second can be downsampled to 4K frames at 30 frames per second at the FPGA/ASIC level, and then sent to the compute unit so that the compute unit can partition a task to analyze 4K image frames at 30 frames per second. In one example, it can partition a task to analyze a stereo pair of 4K image frames at 30 frames per second.
Additionally, the 8K frame captured at 2400 frames per second can be downsampled to 1080p frames at 240 frames per second at the FPGA/ASIC level, and then sent to the compute unit so that the compute unit can separately partition a task to analyze 1080p image frames at 240 frames per second, both streams of image data coming from the same pair of cameras. This would reduce the need for two sets of stereo cameras enclosed in a single camera bank. The disclosure above is for illustration purposes only, and the FPGA/ASIC and other microcontrollers described can downsample the image stream to any type of quality and speed to send to the compute units for analysis. The camera module itself can account for auto balance, auto white balance, auto exposure, tone, and focus, as well as synchronization with the LED lights, including the LED lights' exposure, temperature, and peak-to-peak exposure time, and can perform color correction algorithms on the images before the images are sent to the compute unit for analysis. Each LED light may be synchronized to turn on and off with respect to when an identification camera(s) is capturing an image. The number of cameras or other sensing devices, as well as the number of individual LED lights, is for illustration purposes only. In one example, more than two treatment units 1110 can be supported by a single modular treatment module 900 or be a part of a single modular treatment module 900. The modular treatment module 900 can include a varying number of sensing enclosures, illumination modules, and treatment units, all operably connected to each other as one treatment module similar to that of treatment module 804.
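A minimal sketch of producing two downsampled streams from one high-rate source, analogous to the FPGA/ASIC behavior described above, is given below assuming OpenCV; the resolutions and keep-every ratios are placeholders, not the hardware's actual figures.

```python
import cv2

def dual_stream(frames, hi_res=(3840, 2160), hi_keep_every=80,
                lo_res=(1920, 1080), lo_keep_every=10):
    """frames: iterable of full-resolution frames in capture order."""
    hi_stream, lo_stream = [], []
    for i, frame in enumerate(frames):
        if i % hi_keep_every == 0:       # sparse, high-resolution stream
            hi_stream.append(cv2.resize(frame, hi_res))
        if i % lo_keep_every == 0:       # denser, lower-resolution stream
            lo_stream.append(cv2.resize(frame, lo_res))
    return hi_stream, lo_stream
```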

FIG. 9B illustrates an alternate configuration of an example modular treatment module 902. The modular treatment module 902 can include a support structure and components supported by or embedded in the support structure, including a treatment unit and a treatment unit support structure 923, one or more image sensors 918 including a compute unit and image sensor box or enclosure 916, and one or more illumination units 920 having one or more LED lights with one or more lenses.

FIG. 10 illustrates an example method 1000 that may be performed by some example systems, subsystems, or components of systems described in this disclosure. For example, at step 1010, the agricultural treatment system can determine a first real-world geo-spatial location of the agricultural treatment system. At step 1020, the agricultural treatment system can receive one or more captured images depicting real-world agricultural objects of a geographic scene. At step 1030, the agricultural treatment system can associate the one or more captured images with the determined geo-spatial location of the agricultural treatment system. At step 1040, the agricultural treatment system can identify, from a group of indexed images, mapped images, previously assigned images, or representations of agricultural objects including at least in part image data, position data, or a combination thereof, one or more images having a second real-world geo-spatial location that is proximate to the first real-world geo-spatial location. At step 1050, the agricultural treatment system can compare at least a portion of the identified images with the one or more captured images. At step 1060, the agricultural treatment system can determine a target object based on comparing at least a portion of the one or more identified images with at least a portion of the one or more captured images. At step 1070, the agricultural treatment system can emit a fluid projectile at a target object in the real world with a treatment device. The target objects are real-world objects that are intended to be sprayed with the fluid projectile.
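A minimal sketch of the proximity lookup in step 1040 is shown below: selecting previously indexed images whose recorded geo-spatial location lies near the current location. The record fields, coordinates, and the 2-meter radius are assumptions for illustration only.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in metres."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby_indexed_images(index, lat, lon, radius_m=2.0):
    """index: list of dicts like {'image_id': ..., 'lat': ..., 'lon': ...}."""
    return [rec for rec in index
            if haversine_m(lat, lon, rec["lat"], rec["lon"]) <= radius_m]

index = [{"image_id": "kf_0412", "lat": 36.77021, "lon": -119.41793},
         {"image_id": "kf_0977", "lat": 36.77102, "lon": -119.41655}]
print(nearby_indexed_images(index, 36.77022, -119.41791))
```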

The agricultural treatment system may store the group of images in an onboard data storage unit or a remote storage unit. The group of images may include key frame images and sub-key frame images. The key frame images may depict agricultural objects of the geographical scene, and the sub-key frame images may depict a portion of a key frame image; for example, a portion of a key frame image can be an image of an agricultural object or a cluster of agricultural objects. The key frame images may be images that were previously obtained by image sensors of the system. The captured digital images may be obtained by the same cameras of the system at a time subsequent to when the key frame images were taken. For example, in one trial run, the agricultural treatment system, or similar systems 100, 600, and 800, can perform observations of a geographic boundary, including detecting and indexing any and all agricultural objects captured by image sensors, and perform one or more precision treatments on detected agricultural objects on the geographic boundary, such as a farm or orchard. The agricultural treatment system can index each image captured by its onboard vision system, including one or more image sensors configured to capture images of agricultural objects or crops, either on board or offline at a remote computing location near the physical location of the geographic boundary or at a different remote location, such that the remote computing units can communicate with the agricultural treatment system. The indexed series of images captured by image sensors can be further indexed, where one or more of the captured images can be assigned as a keyframe and include a unique keyframe marker. Each keyframe can represent an image that includes one or more unique agricultural objects or landmarks of interest in the real world. Because of the navigation unit of the agricultural treatment system, the keyframes can include location data and a timestamp. For example, the agricultural treatment system, in a trial, can capture a series of images as the vehicle travels along a path in the geographic boundary. The series of images captured can be images taken of a row of plants, including row crops grown directly from the soil or crops growing off trees. One or more images of the series of images captured can include agricultural objects of interest, either for treatment or for observation, where the agricultural object can grow into a stage where it is desirable to select a treatment for the agricultural object. The agricultural treatment system can assign the particular image having the individual agricultural object identified as a keyframe. The keyframe, or any other image captured by the agricultural treatment system, can include a location based on image analysis performed by the compute unit of the treatment system. For example, a stereo vision system can use epipolar geometry to triangulate a location of an object identified in an image relative to the location of the image capture device.

Additionally, each portion of the image that includes agricultural objects can be labeled and assigned a unique identifier to be indexed in a database. The data indexed can be a 2D- or 3D-constructed image of an agricultural object having location and position data attached to the image and a timestamp of when the image was taken. In future trials conducted by the agricultural treatment system, the agricultural treatment system may capture images of the same agricultural object at the same or a similar location in the geographic boundary. Since the image captured of the agricultural object in the same position was acquired at a later time than the previously captured agricultural object, the agricultural object may have grown to have different features. In one example, the agricultural treatment system can determine that an acquired image of an agricultural object with location and position data is associated with a previously acquired, labeled, assigned, and indexed image, or other indexed representation, of an agricultural object that is the same agricultural object as the currently detected object. Having associated the two images with location and timestamp data, the agricultural treatment system can determine treatment parameters, including whether to perform a treatment at the given time or trial, determining a mixture, chemical type, volume, concentration, etc., of a treatment, and a precise trajectory for the treatment to be deposited on a surface of the agricultural object. In one example, a user can select in an application the indexed agricultural object, and a user interface of the agricultural treatment system can display information related to the agricultural object, including images taken of the agricultural object, including multiple images taken at different locations and with different orientations of the image capture device for capturing different views of the same agricultural object, as well as multiple images taken at different points in time as the agricultural treatment system conducts multiple trials and captures images at the same or nearly the same location as previously captured images.
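A minimal sketch of how an indexed record for one unique agricultural object might be organized (a stable identifier plus a growing history of observations with image reference, geo-location, timestamp, and any treatment applied) is shown below; the field names and values are illustrative assumptions, not the schema used by the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Observation:
    image_id: str                     # keyframe or patch containing the object
    lat: float
    lon: float
    timestamp: float                  # seconds since epoch for this trial
    treatment: Optional[str] = None   # e.g. a treatment description, or None

@dataclass
class AgriculturalObjectRecord:
    object_id: str                    # unique identifier in the indexed database
    object_type: str                  # e.g. "fruitlet", "weed", "soil patch"
    observations: List[Observation] = field(default_factory=list)

    def add_trial(self, obs: Observation) -> None:
        """Append an observation of the same object from a later trial."""
        self.observations.append(obs)

record = AgriculturalObjectRecord("obj-00132", "fruitlet")
record.add_trial(Observation("kf_0412", 36.77021, -119.41793, 1.650e9))
record.add_trial(Observation("kf_2210", 36.77021, -119.41793, 1.651e9, "growth spray"))
```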

The above example illustrates the agricultural treatment system performing two trials with two sets of images captured at different times, for example a day apart, of the same agricultural object, and associating the images of the agricultural object with each other based on image features detected that are common between the images; position, depth, localization, and pose related information from image analysis and computer vision techniques; and similar position data captured by the navigation unit of the agricultural treatment system. As more trials are conducted and more images of the same agricultural object are taken, capturing the agricultural object's current growth stage, and as each captured agricultural object is associated with one or more previously captured images of the same agricultural object, the treatment system can build a unique profile of each individual agricultural object mapped in a geographic boundary, including images associated with each of its growth stages and any and all treatment history for that individual agricultural object. This can allow a user or a treatment system to determine a crop's health, including diseases and stress (for example, fire blight detection), color change, size, count, growth projection, and yield projection and estimation for the crop grown on a farm or orchard, and can allow a user to optimize growing crops on a farm by observing and controlling the growth rate of each individual agricultural object detected in a geographic boundary.

In one example, to identify target objects for spraying, the system may compare at least a portion of the identified images by comparing the sub-key frame image to a portion of one of the captured images. In other words, the agricultural treatment system can compare one or more patches or labeled portions of a previously indexed image of an agricultural object with at least a portion of the currently captured image. In this example, a patch is an image cropped out of a bigger image having one or more features of interest. The features of interest in the bigger image captured by image sensors can include agricultural objects, landmarks, scenes, or other objects of interest to be identified, labelled, and assigned a unique identifier or marker to be indexed. For example, a bounding box, or other shape, can be drawn around a portion of an image, and that portion can be cropped out, separately indexed by the agricultural treatment system, and saved as a patch for comparing against captured images taken in the future, for building a digitized map of a geographic boundary, for associating an object captured during one trial with the same object captured in different trials, or a combination thereof. The system determines a confidence level of whether the sub-key frame image matches the portion of the captured image. The system identifies a match where the determined confidence level meets or exceeds a predetermined confidence level threshold value. In one example, various computer vision techniques can be applied to compare and correspond images and determine similar features for matching. These can include template matching for comparing a portion of an image with the region of interest of another image, normalized cross correlation, random sample consensus (RANSAC), scale-invariant feature transform (SIFT), FAST, edge orientation histograms, histogram of oriented gradients, gradient location and orientation histogram (GLOH), ridge and edge detection, corner detection, blob detection, line detection, optical flow, the Lucas-Kanade method, semantic segmentation, correspondence matching, and other computer vision and matching techniques. The system may identify that a captured image includes a target object to be treated, or a target object that was already sprayed and does not currently need a treatment, based on features detected of the agricultural object, based on its treatment history, or a combination thereof. Based on the determined location of the image sensors of the agricultural treatment system and the location of the target object in the obtained image, the system can then configure, orient, and prepare the treatment unit such that a fluid projectile, when emitted, is sprayed in a trajectory that deposits fluid onto the real-world targeted agricultural object.
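One way to compute the confidence level described above is normalized cross-correlation template matching; the sketch below (an assumption about approach, with a placeholder threshold) compares a previously indexed patch against a newly captured image and reports whether the best match clears the threshold.

    import cv2

    def patch_matches(captured_bgr, patch_bgr, threshold=0.8):
        """Return (matched, confidence, top_left) for a stored patch searched within a
        newly captured image using normalized cross-correlation."""
        captured = cv2.cvtColor(captured_bgr, cv2.COLOR_BGR2GRAY)
        patch = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2GRAY)
        scores = cv2.matchTemplate(captured, patch, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(scores)
        return max_val >= threshold, float(max_val), max_loc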

In another example, the system may use landmark features or objects to determine locations of target objects to be sprayed. The landmark objects are real-world objects that aid in determining the location of a target object. The system may identify a landmark object in a captured image and determine that a portion of the landmark object in the captured image matches a portion of an image from the group of images. While not intended to be an exhaustive list, examples of landmark objects may include a man-made object, a fence, a pole, a structure, a portion of a plant structure, a portion of a tree structure, a leaf formation, or a leaf cluster that can be used to mark a specific location of a geographic boundary or to distinguish a specific keyframe as having the unique landmark assigned to the portion of the keyframe.

In another example, in one mode of operation, in a first pass along a path through an agricultural environment, the agricultural treatment system obtains a first set of multiple images while the system moves along the path. For example, the agricultural treatment system uses onboard cameras and obtains multiple digital images of agricultural objects (e.g., plants, trees, crops, etc.). While obtaining the multiple images of the agricultural objects, the agricultural treatment system records positional and sensor information and associates this information with each of the obtained images. Some of this information may include geo-spatial location data (e.g., GPS coordinates), temperature data, time of day, humidity data, etc. The agricultural treatment system or an external system (such as a cloud-based service) may further process the obtained images to identify and classify objects found in the images. The processed images may then be stored on a local data storage device of the agricultural treatment system.

In a second pass along the agricultural environment, the agricultural treatment system, using the onboard cameras, obtains a second set of multiple digital images along the path that had been previously taken on the first pass. For example, the agricultural treatment system may obtain the first set of multiple images on day 1, with the images capturing blossoms on a group of apple trees. The digital images depicting the apple trees may be processed for object classification of the types of blooms depicted in the digital images. The agricultural treatment system may retrieve the processed imagery and associated data identifying the objects and classified types. On day 2, the agricultural treatment system may again follow the original path and obtain new imagery of the apple trees. The agricultural treatment system may then compare the second set of obtained images with the received processed images to identify target agricultural objects to be sprayed, and then spray the agricultural objects. The system can then match the landmark objects to aid the system in determining locations of target objects. In other words, the system may use feature matching of objects in the imagery to determine that a prior image is similar to a captured image.

For example, the processed images received by the treatment system may have associated positional information. As the agricultural treatment system moves along the path in the second pass, the agricultural treatment system may compare a subset or grouping of the processed images, selected based on the location information associated with the processed images and the then-current position or location of the treatment system. The agricultural treatment system compares new images to the processed images and determines whether the images, or portions of the images, are similar. The agricultural treatment system may then identify a location to spray based on a likely location of a target object in the processed images.
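A minimal sketch of that location-based narrowing, assuming each processed image record carries the position at which it was captured, might filter candidates by distance before any image comparison is attempted; the field name and radius here are hypothetical.

    import math

    def candidate_keyframes(processed_images, current_position, radius_m=3.0):
        """Return previously processed images whose recorded capture position lies within
        radius_m of the vehicle's current position, limiting the images to compare."""
        return [
            img for img in processed_images
            if math.dist(img["position"], current_position) <= radius_m
        ]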

As noted above, the agricultural treatment system may associate images captured by a camera(s) with the real-world physical locations where images of agricultural objects were obtained. For example, while a vehicle with an agricultural treatment system is moving along a path, an electronic control unit (ECU) of the agricultural treatment system may generate camera data signals and light data signals, with lighting from the lighting devices of the agricultural treatment system synchronized with the capturing of digital images. The ECU may synchronize illumination, by one or more lights mounted on the vehicle, of the physical location of an object(s) for generation of the respective captured image(s) that corresponds with that physical location of the object(s). The object determination and object spraying engine sends the camera data signals and light data signals to the ECU. The object determination and object spraying engine generates position information that corresponds with a position and an orientation of the vehicle with respect to the physical location(s) of the agricultural object(s) and a current route of the moving vehicle. The position information may further be associated with the respective captured image(s) that corresponds with the physical location(s) of the agricultural object(s).

FIGS. 11A-B illustrate example implementations of method 1200 that may be performed by some example systems, subsystems, or components of systems described in this disclosure. For example, in one mode of operation, at step 1210, an agricultural treatment system can receive image data from one or more sensors, the image data including one or more agricultural objects. The one or more agricultural objects can be identified as one or more target plants from the image data. At step 1220, the agricultural treatment system can receive agricultural data representing agricultural objects including different crops and target plants. At step 1240, the agricultural treatment system can identify a location of the target plant. At step 1250, the agricultural treatment system can determine treatment parameters of the target plant. At step 1260, the agricultural treatment system can compute a vehicle configuration and treatment unit configuration for treating the target plant. At step 1270, the agricultural treatment system can lock the treatment unit onto the target plant in the real world. At step 1280, the agricultural treatment system can activate the treatment unit and emit a fluid projectile of a treatment chemical onto the target plant.

Additionally, the agricultural treatment system can receive, fuse, compute, compensate, and determine positional, localization, and pose related signals within a geographic boundary. At step 1212, the agricultural treatment system can receive sensor data from one or more sensors on a vehicle in an agricultural environment. The agricultural environment can be a geographic boundary having a plurality of objects typically found on a farm or orchard for cultivating land and growing and harvesting crops. At step 1214, the agricultural treatment system can identify a vehicle position and one or more agricultural objects in proximity of the vehicle, and determine distances from the vehicle to the agricultural objects. At step 1216, the agricultural treatment system can calibrate the vehicle, including calculating a pose estimation of the vehicle relative to a central or known point in the geographic boundary, a pose estimation of components of the agricultural treatment system relative to the vehicle supporting the agricultural treatment system, or poses of agricultural objects detected in space relative to the vehicle. The vehicle can be calibrated by locating one or more calibration targets spread throughout a mapped geographic boundary, such that as the agricultural treatment system identifies a physical calibration target and calculates its position relative to the calibration target, the agricultural treatment system can determine, or correct a previously inaccurate determination of, the position of the vehicle in the geographic boundary.

FIGS. 12A-12B illustrate example images obtained by an agricultural observation and treatment system described in this disclosure. As shown in the diagram 1300a of FIG. 12A, an image received by an agricultural treatment system may include multiple identifiers of different types of objects, for example objects 1302, 1304, 1306, 1308, and/or 1310 of a plurality of detected objects, each having a different identifier portrayed in a captured image. For example, an identifier marked for object 1306 or 1308 may identify a portion of the captured image that portrays a physical landmark of an agricultural object, or landmarks in an agricultural scene. The identifier for object 1306 may further be based on visual characteristics of the object.

The diagrams 1300a and 1300b, representing images with one or more detections, can either be images ingested by a compute unit of a component treatment module, with machine learning based detections, computer vision based detections, or both, performed by feature extraction and object detection in real time while the treatment module is scanning an environment, or images with labels produced by human labelers, by machine learning detections, or by a combination thereof, where a machine learning detector scans and detects objects and landmarks in a given frame and a human labeler verifies the quality of the detections and manually labels missing or incorrectly classified objects.

FIG. 12B illustrates another image 1300b depicting another example real-time captured image with real-time object detection, or a received labelled image in which the labelling of objects was done offline, away from the portion of the agricultural observation and treatment system supported by a vehicle. Additionally, diagram 1300b illustrates example portions or sub-images of an image obtained by an agricultural treatment system.

In one example, diagram 1300b is a labelled image, labelled either in real time by an agricultural observation and treatment system on the vehicle, or offline at a server, by a human, by a machine learning algorithm, assisted by a machine learning algorithm, or by a combination thereof.

Based on visual characteristics of an instance of an apple blossom portrayed by the captured image of an apple tree, the labeled image may include an identifier 1302b for the apple blossom instance. The identifier 1302b may be positioned in the labeled image 1300b at a first pixel position that corresponds to the apple blossom instance's physical location as it is portrayed in the captured image of the apple tree. Based on visual characteristics of an instance of an apple fruitlet portrayed by the captured image of the apple tree, the labeled image may include an identifier 1310b for the apple fruitlet instance. The identifier 1310b may be positioned in the labeled image at a second pixel position that corresponds to the apple fruitlet instance's physical location as it is portrayed in the captured image of the apple tree. Based on visual characteristics of an instance of a landmark portrayed by the captured image of the apple tree, the labeled image may include an identifier 1308b for the landmark instance, the specific landmark identifier 1308b being that of two branches diverging in the specific pattern, shape, and orientation illustrated in 1300b. The identifier 1308b may be positioned in the labeled image 1300b at a third pixel position that corresponds to the landmark instance's physical location as it is portrayed in the captured image of the apple tree.

In one example, to perform better VSLAM in an agricultural scene, certain objects that are landmarks and that are tracked across time can improve the quality of VSLAM and pose estimation, for example large, stationary objects typically found in the specific agricultural scene. Landmarks can be used to identify which frames are of interest to store, or to store as keyframes (because one does not need many frames at once all having the same fruits, or detected objects, from frame to frame), and can be identified in real time and tracked for visual based navigation and mapping, including VSLAM, because there are spatial locations for each of the objects and landmarks and their unique identifying characteristics. In one example, tree trunk 1336 can be detected by a machine learning algorithm, or programmatically predefined as a stationary dark object that protrudes from the ground. Detecting and tracking tree trunks in an orchard can allow a system to partition an agricultural environment by the trees themselves, for example to minimize the error of detecting one cluster of objects and concluding its origin is at one place when it should be at another. For example, a system can detect a first tree trunk having a first location in the global scene, as well as determine a pose of the system itself relative to the detected tree trunk. The system will also detect a plurality of objects, including each object's identity and whether that unique object was detected before, either with the same identifier or with a different identifier because the phenological state of the object has changed while it remains the same object in space. In this example, the system can associate a cluster of detected objects, being on the same tree, with the detected tree trunk. Consider the case in which the system incorrectly detects other objects or landmarks at different, nearby trees because their patterns are similar to a previously identified pattern or object, and its location-based sensors are not accurate enough to register the change in location between a first object, pattern, or landmark located near a first tree trunk and a second object, pattern, or landmark located at a second tree trunk, for example because the GPS sensor is off by a few meters or did not update in time. An additional check for the system can be detecting a first tree trunk and a second tree trunk. Because the system knows that two different tree trunks must be far enough apart from each other, the system can determine that a previously detected object determined to be at a certain location is likely wrong, because the system also determines that the detected object was in proximity to another tree trunk that could not have been located at a different location.

While tree trunks are unique to orchards, any large, stationary objects or patterns that are unique to the specific geographic environment can be programmatically detected to better improve spray performance, navigation performance, and mapping of the scene. For example, detecting beds, troughs, furrows, and tracks of a row crop farm can be used to improve performance in observing and performing actions in the row crop farm. The techniques used can be a combination of computer vision, machine learning, or machine learning assisted techniques for detecting beds, troughs, furrows, and tracks, such as detecting long lines in a captured frame, detecting differences in depth between lines (for example, tracks and beds will have substantially the same line pattern because they are next to each other but have different depths), which can be detected with depth sensing techniques, and detecting changes in color between beds and tracks, for example.

In one example, the object determination and object spraying engine generates positional data for an instance of the fruit at a particular stage of growth that is portrayed in a captured image based in part on: (i) a pixel position of the portrayal of the instance of a fruit at the particular stage of growth in the labeled image (and/or the captured image), (ii) the position information of the moving vehicle, and/or (iii) previously generated position information associated with a previous captured image(s) of the instance of the fruit and the physical location of the instance of the fruit. Previously generated position information may be associated with captured and labeled images that portray the same instance of the fruit when the vehicle traveled a similar route at a previous time, such as a prior hour of the day, prior day, week, and/or month. The agricultural treatment system may generate nozzle signals for the synchronization ECU of the agricultural treatment system on a vehicle based on the positional data for the instance of the fruit at the particular stage of growth. For example, the nozzle signals may indicate a physical orientation of the nozzle to create a trajectory for a liquid. The nozzle signals may represent a change in a current orientation of the nozzle based on one or more axial adjustments of the nozzle.
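As a simplified, hypothetical illustration of turning positional data into nozzle orientation signals, the sketch below computes pan and tilt angles that point a nozzle at a target expressed in the nozzle's own coordinate frame; a real system would additionally compensate for vehicle motion, projectile drop, and actuation latency.

    import math

    def nozzle_angles(target_xyz):
        """Compute pan/tilt angles (radians) to aim a nozzle at a target located at
        (x forward, y left, z up) in the nozzle's frame."""
        x, y, z = target_xyz
        pan = math.atan2(y, x)                     # rotation about the vertical axis
        tilt = math.atan2(z, math.hypot(x, y))     # elevation above the horizontal plane
        return pan, tilt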

The object determination and object spraying engine sends the projectile from the nozzle towards the physical location of the object according to the trajectory. For example, the object determination and object spraying engine adjusts a current orientation of the nozzle according to the nozzle signals and triggers the nozzle to spray a liquid towards the physical location of the instance of the fruit.

Because not all plants need the same amount of treatment, for example by type, volume, frequency, or a combination thereof, based on the stage of growth of the particular plant, the agricultural treatment system can be configured to scan a row of crops to identify the stage of growth of each individual crop or agricultural object that is a plant or portion of a plant, and determine whether the identified crop or agricultural object needs a treatment on the particular trial run, or day, or at the particular moment in time the vehicle with the agricultural treatment system is on the field and has detected the individual agricultural object. For example, a row of crops, even of the same kind of plant, can have a plurality of agricultural objects and sub-agricultural objects of those agricultural objects, where the agricultural objects may exhibit different physical attributes such as shape, size, color, density, etc.

For example, a plant for growing a particular type of fruit, such as a fruit tree, can in one agricultural cycle produce one or more individual crop units, each taking the shape of a first type of bud, a second type of bud, and so forth, a flower, a blossom, a fruitlet, and eventually a fruit, depending on the growth stage of the particular crop. In this example, the agricultural treatment system can label each stage of the same identified object or crop, down to the particular individual bud on the fruit tree, as different agricultural objects or sub-agricultural objects, as the object changes through its growth stages, including its particular shape, size, color, density, and other factors that indicate growth into a crop. The different agricultural objects detected and labelled that are associated with the same object in real-world space can be associated with each other.

For example, a detected bud can be labelled as a unique agricultural object with a unique identifier or label. As time moves forward in a season, the uniquely labelled bud that is mapped on a farm may change shape into a flower for pollination, or from a flower to a fruitlet, and so forth. As this happens, an agricultural treatment system can identify the flower, label it with a unique identifier for the detected agricultural object, associate the agricultural object that is the flower with the agricultural object that is the previously identified bud, and logically link the two identified agricultural objects as the same object in the real world, where one identified object has grown into the other. In another example, the unique real-world flower detected, of a plurality of flowers and other objects in a geographic boundary, can be labelled as a flower but not considered a different agricultural object, and instead be associated with the same agricultural object previously labeled as a bud. In this example, each detected object that can be considered a potential crop can be mapped as the same agricultural object, even though the agricultural object will change shape, size, density, anatomy, etc. The same agricultural object detected in the same space at different times can then have different labels and identifiers related to its stage of growth. For example, a first agricultural object in space, detected by the agricultural treatment system, can be identified and indexed as real-world agricultural object #40, with a timestamp associated with the time of day and year that the agricultural treatment system captured one or more images or other sensing signals of agricultural object #40. At the moment of identification, the agricultural treatment system can assign a first label to agricultural object #40. The first label can be a bud, or bud #40, since there may be many other buds detected in the geographic boundary, such as a farm or orchard. As multiple trials across a span of time are conducted in the geographic boundary on the same agricultural object #40, agricultural object #40 can turn from a first type of bud, such as a dormant bud, into a second type of bud, or from a bud that blooms into a flower, or undergo many other changes in stages of growth of desired agricultural plants grown for harvest and consumption. In this example, the agricultural object #40 detected as a bud at a given moment in time can be labeled as agricultural object #40 with a first label of bud #40. As time moves forward in a season, the agricultural objects on the farm or orchard, including agricultural object #40 as bud #40, can naturally turn into flowers. At this moment, if and when agricultural object #40 turns into a flower, the agricultural treatment system can label agricultural object #40 as flower #40, associating bud #40 with flower #40 such that bud #40 and flower #40 are the same agricultural object #40 in the real world. Not all agricultural objects detected on the same plant may experience the same stages of growth or continue to keep growing. Some agricultural objects may even be removed, for example by thinning. For example, some plants can be thinned such that one or more agricultural objects growing from a single tree or stem are removed or treated such that the next growth stage will not happen.
In this instance, the agricultural treatment system can still detect that a uniquely identified real-world agricultural object did not reach, or stopped at, a certain growth stage having unique physical features for a unique object label, or that the agricultural object detected previously is now gone and cannot be detected by the agricultural treatment system due to thinning or another method of removing the agricultural object so that neighboring agricultural objects can continue to grow as desired.
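The linking of successive stage labels to one real-world object could be represented as a per-object stage history; the sketch below is a simplified assumption (the stage list, field names, and "missing" convention are illustrative, not disclosed) of how a record might note that an object advanced, stalled, or disappeared after thinning.

    STAGE_ORDER = ["bud", "blossom", "fruitlet", "fruit"]   # simplified example sequence

    def update_stage_history(record, new_label, timestamp):
        """Append a growth-stage observation to an object's profile and note whether the
        object advanced, stayed at the same stage, or can no longer be found."""
        history = record.setdefault("stage_history", [])    # list of (timestamp, label)
        prev_label = history[-1][1] if history else None
        history.append((timestamp, new_label))
        if new_label == "missing":
            record["status"] = "removed_or_thinned"
        elif (prev_label in STAGE_ORDER and new_label in STAGE_ORDER
              and STAGE_ORDER.index(new_label) > STAGE_ORDER.index(prev_label)):
            record["status"] = "advanced"
        else:
            record["status"] = "unchanged"
        return record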

The descriptions of buds, blooms, flowers, fruitlets, and other agricultural objects and stages of growth of such agricultural objects discussed here are only meant to be an example series of objects that can be detected by a treatment system, such as an agricultural treatment system detecting fruits and objects associated with the stages of growth of fruits on fruit trees, and are not meant to be limited only to the specific example described above.

For example, as illustrated in FIGS. 12A and 12B, an image may depict an agricultural environment including a fruit tree having one or more spurs, one or more branches and stems, one or more laterals, and one or more potential crops growing on the one or more laterals. Once an agricultural observation and treatment system, or an agricultural treatment system, described throughout this disclosure, has observed and labelled each identifiable feature of the image, including detecting agricultural objects and labelling their growth stages, and detecting and labelling landmarks, including orientations of portions of the growing tree, configurations of leaves and branches, physical manmade materials that can be detected in the image, or other objects and sights of interest in the image that are not a potential crop, the agricultural treatment system can detect that not all identified objects in the image are agricultural objects of the same growth stage. For example, some agricultural objects detected are labelled as buds, some as blossoms, and some as fruitlets. Each of these labels is of an agricultural object of interest to observe and potentially treat, but not necessarily one treated the same way, depending on the growth stage. The agricultural treatment system can then determine treatment parameters in real time to treat each individually labelled agricultural object with different treatment parameters, or refrain from treating an agricultural object. For example, if a first labelled growth stage does not need to be treated, a second growth stage needs to be treated at least once, and a third growth stage does not need to be treated, the agricultural treatment system can scan through a path, capture images such as the one depicted in image 1300a, and treat only the second labelled growth stage. In this specific example, a blossom can be treated with artificial pollen. The agricultural treatment system can detect that there are buds that have not yet blossomed, and fruitlets that have already grown after the blossom, so the agricultural treatment system will refrain from treating the agricultural object 1302 and only treat agricultural objects labelled with the same label as that of agricultural object 1310. In one example, the agricultural treatment system can select different treatment mixtures and emit different treatment projectiles by volume, concentration, and mixture type, as well as by the type of emission, which can be a single spray projectile, a spray projectile with a large surface area travelling towards the surface of the agricultural object, or a mist or fog type spray treatment. In this example, multiple identified agricultural objects at different growth stages can require treatments with different parameters. Instead of refraining from treating one type of agricultural object at a certain growth stage while treating other agricultural objects having the desired growth stage for a particular trial, the agricultural treatment system can treat multiple types of growth stages of agricultural objects growing on the same tree simultaneously by selecting and receiving a desired chemical mixture for treatment in real time.

The agricultural treatment system can observe a farm or orchard by running a plurality of trials, where one trial is a sequence of capturing sensor data, depositing treatments, or a combination thereof, along each row of crops on the farm or orchard one time, during which the system captures sensor data and has the opportunity to deposit a treatment for each crop or agricultural object detected. For example, a trial run, in which the agricultural treatment system scans through a farm of one or more row crops in one cycle, can be performed once a day, or twice a day, once during daytime and once during nighttime in a calendar day. The agricultural treatment system can also perform multiple trials or runs on a farm or orchard in a single day, particularly if the growth sequence of a plant is more rapid in one season or series of days than in another, such that the agricultural treatment system can capture more changes in stages of growth by conducting more trials, as well as deposit treatments onto surfaces of desired agricultural objects more frequently.

Additionally, each row of crops, whether the row includes the same plant or different plant types, for example planted in an alternating pattern, can include a plurality of plants that have one or more buds exposed, a plurality of plants that have one or more blossoms exposed, a plurality of plants that have one or more fruitlets exposed for treatment, or a combination of plants having a combination of buds, blossoms, fruitlets, etc., exposed at the same time on a single row. In this example, different agricultural objects at different stages will require different treatments at different volumes and frequencies. The agricultural treatment system can identify the particular stage of growth of each uniquely identified agricultural object mapped in the row of a plurality of agricultural objects and give a label or identifier to each agricultural object based on its different and unique growth stage. The agricultural treatment system can then identify the appropriate or desired treatment parameters, including treatment chemical mixture, density, and concentration, and whether a treatment is needed at all for the particular trial, for example if the agricultural treatment system can identify that a particular agricultural object was already treated with a treatment deposition such that another treatment at the given trial would be too close in time for the same treatment to be applied again to the same unique agricultural object in the geographic boundary, depending on the stage of growth detected.

The agricultural treatment system can detect a first agricultural object of a plurality of agricultural objects in a row of plants inside a geographic boundary such as a farm or orchard. The agricultural treatment system can determine that the first agricultural object is different from a plurality of other agricultural objects by type, or that the first agricultural object detected is among a plurality of agricultural objects of the same type as the first and can be indexed by a unique identifier to identify the particular object in the real world, so that each unit or object in the real world of the same agricultural object type can be indexed and located in the geographic boundary. For example, a first agricultural object of a plurality of agricultural objects of the same plant type, on the same tree or root, can be identified on an orchard or row farm. The first agricultural object can be assigned and indexed as agricultural object #400, with a unique identifier that identifies its object type, such as a type of crop, its location in the geographic boundary, and the time that the identifier was assigned to the first agricultural object. The agricultural treatment system can also assign a label to the first agricultural object based on its size, shape, color, texture, etc., with a first label, for example fruitlet #400 if the detected first object is a fruitlet of a crop. Because different stages of growth of the same desired plant or crop can require a different type, frequency, volume, or combination thereof of treatment, the agricultural treatment system can determine treatment parameters in real time upon detecting the first agricultural object in space and its growth stage, with the growth stage either determined in real time or determined based on the growth stage detected on a previous trial. For example, if the first agricultural object detected at a particular time is a flower or cluster of flowers, the agricultural treatment system can label the flower detected in one or more images as a flower and determine treatment parameters for the flower. The agricultural treatment system can apply the same type, mixture, amount, and frequency of a treatment to each object of the same agricultural object type detected at the same growth stage along the same row of plants. The agricultural treatment system can apply a different type, mixture, amount, and frequency of a treatment to each object of the same agricultural object type detected at a different growth stage along the same row of plants. In one example, the different growth stages of the plant or portion of a plant can vary by days or hours in one part of a season and vary by weeks or months in another part of a season. For example, a tree of a plurality of trees in a row of the same type of plant yielding the same crop can have portions of the tree, for example shoots, spurs, stems, laterals, or branches with nodes, clusters, buds, or other objects for crops, growing at different stages. A bud for a potential crop can form on one portion of the tree or lateral while other portions of the tree do not have buds. At this stage, the agricultural treatment system can identify the portions of the tree that do have buds and perform any treatment, including chemical treatment or light treatment (e.g., laser), that is appropriate for treating a bud of a certain plant. In another example, a tree can have some laterals that have blossoms and some laterals that only have buds.
In this example, the blossoms may be treated with a certain treatment and the buds may be treated with a different type of treatment than that used for the blossoms. The agricultural treatment system can identify and distinguish between the various agricultural objects in space having different labels based on their growth stages, and apply a treatment appropriate for each unique agricultural object identified and located in the real world.

The agricultural treatment system can also identify and index a treatment history for each unique agricultural object identified in the space of a geographic boundary. For example, one or more buds detected on laterals of a tree can be treated with a certain type of chemical or light treatment. At this point in time, the tree will have other laterals that have yet to form buds. As time moves forward and the agricultural treatment system engages the row of crops for treatment, the laterals that had yet to form buds may now have buds. Additionally, the previously detected buds that have been treated may not yet have turned into flowers, or may have reached a further bud stage that requires an additional treatment or a different type of treatment. In this example, because the agricultural treatment system has indexed each agricultural object detected by its growth stage, with a label across time and a timestamp for each time the agricultural object was detected, together with its specific growth stage and an image of the labeled growth stage, the agricultural treatment system can determine which agricultural objects in the row require treatment and which agricultural objects in the row do not require a treatment, either because an object was already treated in a previous trial and does not need a treatment every trial, or because it has not reached a later detected growth stage that will require a different type, frequency, mixture, etc., of treatment.
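A stripped-down sketch of that decision, under assumed (not disclosed) policy values, could key treatment parameters off the detected growth stage and skip objects whose indexed treatment history shows the same chemical was applied too recently:

    from datetime import datetime, timedelta

    # Hypothetical per-stage policy, for illustration only.
    TREATMENT_POLICY = {
        "bud":      {"chemical": "chemical-1", "min_interval": timedelta(days=7)},
        "blossom":  {"chemical": "pollen-mix", "min_interval": timedelta(days=3)},
        "fruitlet": None,    # no treatment needed at this stage in this example
    }

    def decide_treatment(stage, treatment_history, now=None):
        """Return treatment parameters, or None to skip, based on the detected growth stage
        and the object's treatment history (a list of (timestamp, chemical) entries)."""
        now = now or datetime.utcnow()
        policy = TREATMENT_POLICY.get(stage)
        if policy is None:
            return None
        previous = [t for t, chem in treatment_history if chem == policy["chemical"]]
        if previous and now - max(previous) < policy["min_interval"]:
            return None          # already treated recently with the same chemical
        return policy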

As with the earlier example, the first real-world agricultural object #400, having one or more images, a location, and an object type associated with object #400, can, based on its labelled stage of growth, for example label #400, require a first treatment having a specific treatment mixture, type, volume, concentration, etc., and projectile emission strength. A second agricultural object #401, in proximity to agricultural object #400, for example being a potential object for harvest on the same tree as agricultural object #400, having one or more images, a location, and an object type associated with agricultural object #401, can, based on its label #401, require a second treatment having a specific treatment mixture, type, volume, concentration, etc., and projectile emission strength. The difference in treatment parameters, such as the mixture, type, volume, concentration, strength of the projectile emitted, or a combination thereof, or abstaining from depositing a treatment at all for the particular trial run conducted by the agricultural treatment system, can be based on the different growth stage detected, even if the agricultural objects are of the same type. In one example, different treatment parameters can be applied to a row of crops with the same type of plant, where portions of the plant, such as various laterals, can have agricultural objects growing on them at different stages and requiring different treatments. Different treatment parameters can be applied to a row of crops with different plants in the row, for example with alternating crops. In one example, the same treatments with the same treatment parameters can be applied to the same row of crops for each agricultural object having the same or a similar stage of growth. In one example, different concentrations or frequencies of treatments can be applied to a row of crops of either the same plant or different plants at different stages of growth. For example, a first bloom of a lateral can require one deposition of chemical-#1 with a certain mixture, concentration, volume, etc. Other portions of the tree, or other laterals, may not yet have experienced a bloom from their buds, so only the first bloom will receive a treatment of chemical-#1. At a later time, and more specifically at a later trial performed by the agricultural treatment system, other laterals will experience a bloom, such as a second bloom. In one example, it would be desirable for the second bloom to receive a single treatment of chemical-#1. Since the first bloom already received a treatment of chemical-#1, and for this particular example growth stage of this particular plant type the first bloom only requires one treatment of chemical-#1, the agricultural treatment system can detect that the agricultural object of the second bloom requires a treatment of chemical-#1 of a specified volume, concentration, and strength of projectile and apply the treatment of chemical-#1, and detect that the agricultural object of the first bloom does not need a treatment at all for this trial.

For example, a treatment module with one or more image sensors can, in real time, sense and detect both object 1302, for example a fruitlet, and object 1308, which is a landmark. In one example, a landmark can be a specific pattern detected and indexed in the geographic scene, for example a tree pattern branching into two branches. As the vehicle moves forward in a row of an orchard, the treatment module's image sensors translate and move relative to the tree, for example from right to left, and scan the tree illustrated in 1300a in real time. As the treatment module, with its compute unit, detects objects in the tree while the vehicle is moving, the treatment module can track both the object 1302, for targeting, tracking, and treating via the treatment unit, and the landmark object 1308, to generate and obtain a higher accuracy motion estimation. In this example, the detecting, via neural networks or computer vision methods such as template matching, correspondence matching, homography estimation, etc., or a combination thereof, and the tracking of the target object can be done for treatment, but the target object can also be tracked for the motion estimation of the treatment module itself, and by extension of the treatment unit and its treatment head. The additional tracking of other objects, including other target objects, landmarks that are real-world objects, or real-world objects or salient points in an image that can be tracked, can add accuracy to the pose estimation of the treatment module, which reduces error or misalignment of treatment when the treatment module's compute unit sends instructions to the treatment unit for treatment.
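One possible way to track both the target and the landmark from frame to frame for motion estimation is pyramidal Lucas-Kanade optical flow; the sketch below is an assumption about approach, and the averaged pixel shift is only a rough proxy for the treatment module's per-frame motion.

    import numpy as np
    import cv2

    def track_points(prev_gray, next_gray, points_xy):
        """Track target and landmark pixel locations into the next frame; the mean shift of
        the successfully tracked points gives a coarse per-frame motion estimate."""
        prev_pts = np.float32(points_xy).reshape(-1, 1, 2)
        next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, prev_pts, None)
        ok = status.ravel() == 1
        good_prev = prev_pts[ok].reshape(-1, 2)
        good_next = next_pts[ok].reshape(-1, 2)
        mean_shift = (good_next - good_prev).mean(axis=0) if len(good_next) else None
        return good_next, mean_shift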

In one example, the agricultural treatment system can determine that different chemical concentrations of a chemical mixture are required for different growth stages of the same plant on a row of plants. In one example, the agricultural treatment system can determine that different chemical concentrations of a chemical mixture are required for different growth stages of different plants planted in the same row on a farm or orchard. In another example, the agricultural treatment system can determine that only certain growth stages of the agricultural objects detected require a deposition of a particular treatment, and that other agricultural objects detected require a deposition of a different treatment, or no treatment, depending on the stage of growth and treatment history of the particular agricultural object detected in the real world. In one example, a row of plants can have laterals supporting different agricultural objects, or the same agricultural objects with different stages of growth and different treatment histories, such that different treatments are desired for each unique agricultural object in the row. The chemical selection unit can mix different treatment mixtures and concentrations in real time for the agricultural treatment system, to accommodate the different treatment requirements in real time while performing a trial in a particular row of plants. Additionally, the agricultural treatment system can accommodate applying different treatments to different agricultural objects of different plants in a single row, or other configuration, of crops.

Thus, the agricultural treatment system can, in real time, scan with sensors for agricultural objects, their stages of growth, and their real-world locations in the row, and determine whether to apply a particular treatment based on the stage of growth detected and the particular agricultural object's treatment history.

In one example, the agricultural observation and treatment system can be configured to detect objects in real time as image or lidar sensors are receiving image capture data. The treatment system can, in real time, detect objects in a given image, determine the real-world location of the object, instruct the treatment unit to perform an action, detect the action (discussed below), and index the action as well as the detection of the object into a database. Additionally, the treatment system, at a server or edge computing device offline, can detect objects in a given image, detect spray projectiles, spray actions, and spot or splat detections, and index the object detections and spray action detections. In one example, the agricultural observation and treatment system can perform and use various techniques and compute algorithms for performing the object detections, including computer vision techniques, machine learning or machine learning assisted techniques, or a combination thereof, in multiple sequences and layers such that one algorithm partitions a given image and a second algorithm analyzes the partitioned image for objects or landmarks.

In one example, a machine learning model, embedded in one or more compute units of the agricultural observation and treatment system onboard a vehicle, can perform various machine learning algorithms to detect objects, including object detection with feature detection, extraction, and classification, image classification, instance classification and segmentation, semantic segmentation, superpixel segmentation, bounding box object detections, and other techniques to analyze a given image for detecting features within the image. In one example, multiple techniques can be used at different layers or portions of the image to better classify and more efficiently use compute resources on images. Additionally, pixel segmentation can be performed to partition colors in an image without specific knowledge of objects. For example, for row crop farming, a system can perform color segmentation on a given image to partition detected pixels associated with a desired color from all other pixels into two groups, such as the color segmented pixels and the background pixels. For example, a system can be configured to analyze frames by detecting vegetation, which can be a form of green or purple color, apart from background objects such as terrain, dirt, ground, bed, gravel, rocks, etc. In one example, the color segmentation itself can be performed by a machine learning model configured to detect a specific type of color in each pixel ingested by an image sensor. In another example, the color segmentation can be manually predefined as pixels falling within a specific range of a color format. For example, a vegetation algorithm can be configured to analyze a given frame to partition any pixels having attributes of the color "green" from a Bayer filter. In another example, the algorithm can be configured to detect attributes of "green" under any color model where "green" is defined, for example a numeric representation of RGB color (r, g, b) where the value of g > 0 at any bit depth per channel. The algorithm can itself be a machine learning algorithm to detect "green" or a different color of interest.
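A common, simple way to realize the "green versus background" partition described above is an excess-green index threshold; the sketch below uses an illustrative, untuned threshold and is not the disclosed segmentation model.

    import numpy as np

    def vegetation_mask(image_bgr, threshold=20):
        """Separate likely vegetation pixels from background with the excess-green
        index (2G - R - B); 255 marks vegetation, 0 marks background."""
        img = image_bgr.astype(np.int16)
        b, g, r = img[..., 0], img[..., 1], img[..., 2]
        exg = 2 * g - r - b
        return (exg > threshold).astype(np.uint8) * 255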

In one example, machine learning and various other computer vision algorithms can be configured to draw bounding boxes to label portions of images containing objects of interest apart from the backgrounds of images; to apply masking functions to separate background from regions of interest or objects of interest in a given image, in a portion of an image, or between two images where one image is a first frame and the other is a subsequent frame captured by the same image sensor at a different time; and to perform semantic segmentation on all pixels, or a region of pixels, of a given image frame to classify each pixel as part of one or more target objects, other objects of interest, or background, and associate its specific location in space relative to a component of the treatment system and the vehicle supporting the treatment system.

Multiple techniques can be performed in layers on the same image or on portions of the same image. For example, a computer vision technique or machine learning technique can first be applied to an image to perform color segmentation. Once a given image is processed and the pixels related to a desired or target color are segmented, a separate machine learning algorithm or computer vision algorithm can be applied to the segmented image, for example an object detection algorithm to draw bounding boxes around the segmented regions containing weeds and containing crops. In another example, an object detection algorithm can be applied to the entire image to draw bounding boxes around plants of interest, such as crops and weeds. Once the image has bounding box detections drawn around each of the detected crop or weed objects in the image, a color segmentation algorithm can be applied to just those bounding boxes to separate the pixels bounded by each box that are of a target color, such as green, from those pixels that are considered background. This method can allow a system to more accurately determine which pixels are associated with objects in the real world, such that an image with contours and outlines of a specific object detected in the image, such as a leaf, can be a more accurate depiction of the leaf, and the system can therefore more accurately target the leaf in the real world, than drawing a rectangular box around a leaf where the system determines that any portion inside the bounded rectangular box is associated with the object "leaf". The example above is just one of many examples, configurations, orders, layers, and algorithms that can be deployed to analyze a given image for a better understanding of objects, that is, improved feature detection, performed either online in the field in real time, or offline at a server for other uses, such as creating a time lapse visualization, mapping the object, generating key frames with detections for indexing and storage, diagnosing and improving machine learning models, etc.
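The second ordering described above (detect boxes first, then segment color only inside each box) might be sketched as follows, where detector is a hypothetical callable returning (label, x, y, w, h) tuples and segment_fn is a per-patch color segmenter such as the vegetation mask sketched earlier:

    def detect_then_segment(image_bgr, detector, segment_fn):
        """Layered pipeline: run an object detector on the full image, then run color
        segmentation only inside each bounding box so the object's outline, rather than
        the whole rectangle, is what gets targeted."""
        results = []
        for label, x, y, w, h in detector(image_bgr):
            roi = image_bgr[y:y + h, x:x + w]
            mask = segment_fn(roi)
            results.append({"label": label, "box": (x, y, w, h), "mask": mask})
        return results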

In one example, detecting a plurality of agricultural objects and/or landmarks can be used to perform variations of consensus classification. For example, multiple detections of the same agricultural object and/or landmark can be performed to eliminate or reduce false positives or false negatives of object detection. While a machine learning model will be tasked to identify individual objects and landmarks, the closeness of an object to another object in a single frame can also be accounted for and considered by the machine learning detector when detecting an object. For example, suppose that in a first frame the machine learning detector detects a target object as well as a plurality of nearby target objects, other agricultural objects, or landmarks, but in subsequent frames, while the vehicle has not moved enough for the location of the detected target object to have moved out of the next frame, the detector does not detect that same target object yet does detect all of the other nearby target objects, other agricultural objects, and landmarks detected in the first frame. The compute unit can then determine that the first frame may have contained a false positive and flag the frame for review and labelling, at a later time onboard the vehicle for a human to label, or offline, without instructing the treatment unit to perform an action at the location in the real world where the system detected a target object to treat based on the first frame.
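A toy version of that consensus check, under the assumption that detections carry stable identifiers across overlapping frames, might flag a detection for review when it vanishes from later frames while its neighbors persist:

    def flag_possible_false_positive(detection_id, neighbor_ids, frames):
        """frames is a list of sets of detection identifiers observed in consecutive
        overlapping frames, with frames[0] containing detection_id. Return True when the
        detection disappears while all of its neighbors keep being detected."""
        seen_later = any(detection_id in frame for frame in frames[1:])
        neighbors_persist = all(
            any(n in frame for frame in frames[1:]) for n in neighbor_ids
        )
        return (not seen_later) and neighbors_persist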

FIG. 14 illustrates an example diagram 1600 for ingesting an image and performing various computer vision and machine learning algorithms on various portions or layers of the image to extract and detect features of the image.

As discussed above, multiple techniques can be performed in layers on the same image or on portions of the same image. For example, an image 1610 can be acquired by an image capture device and loaded onto a local compute unit of a local modular treatment module. For illustration purposes only, the image 1610 captured can be an image of a row crop farm having one or more beds 1612 supporting a plurality of crops, such as carrots, and weeds, and one or more furrows or tracks 1614 for a vehicle's wheels to run through as the vehicle passes the row. One or more machine learning algorithms and computer vision algorithms embedded in the compute unit, or accessible by the compute unit in real time via the cloud or an edge compute device containing the machine learning algorithm and computer vision algorithm, such as computer vision algorithm 1620 and machine learning algorithm 1630, can be used to partition the image 1610 into analyzed images with features extracted, with the goal of accurately detecting objects in the given image 1610. For example, the first computer vision algorithm 1620, configured to separate beds and furrows, can be applied to analyze and segment the image 1610, classifying portions of the image related to beds as partitioned image 1613 and portions of the image related to furrows as partitioned background image 1615. One purpose of deploying this technique is so that the treatment module does not have to run a machine learning detector on the entire image 1610, but only on portions where objects of interest may be. The partitioning of beds and furrows, like the partitioning of green and background, is just one of many examples of applying a plurality of computer vision and machine learning techniques to an image to reduce computation load while generating accurate detections of features in the real world. Next, the system will have generated a partitioned image 1616 having pixels associated with beds and pixels associated with furrows, such as partitioned image 1613 and partitioned background image 1615. The machine learning algorithm 1630, which for example can be a machine learning algorithm to detect plant objects of interest, such as crop plants and various species of weeds, can be implemented to further analyze the image 1610 or the partitioned image 1616, and only the portion of the image 1610 that is partitioned image 1613, and not the partitioned background image 1615. This allows the ML detector or machine learning algorithm 1630 to analyze fewer pixels or tiles of pixels and reduce the load on the system, while the system has a high probability that the machine learning detector is scanning the most important areas of the image 1610. In this example, the detector would run detections on only a portion of the partitioned image 1616, for example a portion of the partitioned image 1613, such as a patch 1632 of the partitioned image 1613. The treatment system can then draw bounding boxes, semantically classify, or perform various machine learning methods deployed by machine learning algorithm 1630, for example detect objects and draw bounding boxes, and generate a machine-labelled or machine-detected image 1642, which is a labeled image of a portion of the original intake image 1610. The agricultural observation and treatment system can then use those detections to determine which detections are target objects to treat, target the objects in the real world, track the detected objects in subsequent frames, and perform a treatment action on the detected object in the real world.
Additionally, using multiple layers of computer vision and machine learning algorithms to optimize the computing load on a compute unit can also be done to improve VSLAM. For example, vegetation segmentation can be performed to detect green objects. In the VSLAM pipeline for matching keypoints from a frame to subsequent frames from the same sensor, the compute module or compute unit associated with the sensors receiving the images can determine that points associated with green objects are real, stationary objects in the world that can be tracked via VSLAM by the sensors and compute units of each component treatment module for local pose estimation. This allows the VSLAM algorithm to analyze keypoints, in this case points related to corners, contours, or edges of green objects, with higher confidence that the keypoints generated and analyzed are of higher quality than arbitrary salient points, because the keypoints correspond to known objects, such as objects corresponding to green pixels, and the system knows beforehand that green pixels are vegetation, which consists of physical, stationary objects in space of similar size and topography to the target objects for treatment that will be tracked.
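The bed/furrow flow around FIG. 14 could be approximated, very roughly, by running a (hypothetical) plant detector only on tiles of the image that overlap the bed mask, as in the sketch below; the tile size, mask convention, and detector interface are all assumptions rather than the disclosed design.

    def detect_on_beds_only(image_bgr, bed_mask, detector, tile=256):
        """Run the detector only on tiles that touch bed pixels (nonzero in bed_mask),
        skipping furrow/track regions to reduce compute load."""
        h, w = bed_mask.shape[:2]
        detections = []
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                if bed_mask[y:y + tile, x:x + tile].any():
                    patch = image_bgr[y:y + tile, x:x + tile]
                    for label, bx, by, bw, bh in detector(patch):
                        detections.append((label, x + bx, y + by, bw, bh))
        return detections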

FIG. 15A illustrates an example method 1700 that may be performed by some example systems or subsystems described in this disclosure either online, that is, onboard a vehicle supporting one or more modular agricultural observation and treatment systems, subsystems, or components of systems, or offline, that is, at one or more servers or edge compute devices.

At step 1710, the agricultural observation and treatment system can initialize the treatment system. At step 1720, the agricultural observation and treatment system can obtain a first image having one or more unique regions of interest. For example, the regions of interest can be regions or portions of images that are specific to a particular geographic boundary such as a row crop farm or an orchard: for example, images where there are tree trunks, images where a substantial portion of the image is either beds or troughs or furrows, or images where objects of interest have a certain color and every other portion of the image can be background. At step 1730, the agricultural observation and treatment system can identify the one or more unique regions of interest and one or more regions of background. At step 1740, the agricultural observation and treatment system can partition the first image into the one or more unique regions of interest and the one or more regions of background of the first image. At step 1750, the agricultural observation and treatment system can identify a first region of interest among the regions of interest. At step 1760, the agricultural observation and treatment system can detect one or more objects in the first region of interest. At step 1770, the agricultural observation and treatment system can determine a real-world location of a first object of the one or more objects based on a location of the first object detected in the first image. At step 1780, the agricultural observation and treatment system can determine and prepare one or more actions associated with the first object in the real world. At step 1790, the agricultural observation and treatment system can send instructions to activate actuators. The system can repeat steps 1760 through 1790 to detect a second object in the first region and prepare treatment actions associated with the second object. Once all objects of interest are accounted for in the first region of interest, the system can detect objects in a second region of interest for treatment, or partition the image for a second region of interest.

Additionally, as illustrated in FIG. 15B, at step 1782, the agricultural observation and treatment system can identify a second region of interest. At step 1784, the agricultural observation and treatment system can detect one or more objects in the second region of interest. At step 1786, the agricultural observation and treatment system can determine a real-world location of a second object based on a location of the second object detected in the second region of interest in the first image. At step 1788, the agricultural observation and treatment system can determine and prepare one or more actions associated with the second object.

FIG. 16 is a diagram 1800 capturing an action performed by an observation and treatment system. In this example, an image capture device can receive a constant stream of images of a local scene having one or more agricultural objects in the scene. Once a target object is detected, targeted, and tracked, the system will instruct a treatment unit to activate and emit a liquid projectile or a beam of light onto a surface of the target object. This action takes a length of time: the projectile is released from the treatment unit, exits the treatment head, travels through space, hits the target (if accurately targeted with appropriate emission parameters), and creates a splash, splat, or footprint on the ground for row crops where plants, or target plants, are growing out from the ground. The emission parameters include dwell time, which is the amount of time the nozzle head is pointed at the target object while the nozzle head is on a moving vehicle; pressure release time, which is when a pressure actuator such as a capacitor or solenoid valve opens and closes and allows pressurized fluid to release from the valve and through the nozzle head; nozzle orifice size; and other parameters.

In this example, the image capture system can capture and trace the liquid projectile itself, for example fluid projectile 1830. Because the projectile is a fluid, it may not travel in an exact straight line. Additionally, the projectile can be comprised of smaller liquid droplets 1850. The compute unit and image sensors can detect the beam trace directly by detecting the projectile 1830 and its smaller droplets 1850 as the liquid leaves the treatment unit. Additionally, a laser with a laser beam 1840 can be pointed at the intended target object 1820 so that the system can detect both the laser beam and trace the projectile beam to determine whether there was a hit, and whether there was any error or discrepancy from the desired projectile hit location to the actual trajectory of the projectile.

FIG. 17A and FIG. 17B illustrate an example of spray detection, beam detection, or spray projectile detection. In these diagrams 1802 and 1803, one or more image sensors are scanning a local scene comprising a plurality of plants 1872, including target plants for treatment and crop plants for observation and indexing. As the sensor scans the scene while a vehicle supporting the sensor is moving in a lateral direction, the sensor will capture one or more image frames in sequence, illustrated in image frames 1862, 1864, and 1866, where image frames 1864 and 1866 are frames captured subsequently by the sensor that captured image frame 1862, but not necessarily the immediate next frames captured by the image sensor. During the capturing of images, if a component treatment system having sensors and treatment units sends instructions to the treatment unit to perform a spray action, such as emitting a fluid projectile, the image sensors would capture the spray action as it comes into the frame and then eventually disappears as the projectile fully splashes onto the surface of the intended target or ground. In such an example, the spray projectile, such as projectile 1875, can be detected and indexed by the image sensors and the treatment system, as well as the splat area 1877 after the spray has completed. The system can detect the splat size and location.

In one example, the detection of the spray can be performed by various computer vision techniques including spray segmentation, color segmentation, object detection and segmentation, statistical analysis including line fitting, homography estimation, or estimation of a homography matrix, or a combination thereof. For example, the difference between frame 1862 and frame 1864 can be the presence of a spray in one and the lack of a spray in the other, with the rest of the features being the same in each image. In one example, homography estimation is used to account for change in space across a common plane, such as a bed of a row crop farm. A homography matrix can be used to estimate how much movement in space occurred from a first frame to a subsequent frame. The images will be slightly misaligned from each other because the camera is on a moving vehicle while the first frame 1862 and a subsequent frame 1864 are captured. The discrepancy in the frames caused by the motion of the camera can be accounted for with homography estimation, given that the two frames are likely looking at the same plane of equal distance from the camera from the first frame 1862 to the subsequent frame 1864, captured at a later time but not necessarily the exact next frame captured by the image capture device. The difference in the two images, other than the discrepancy which can be accounted for by homography estimation, would be the presence of the spray, which can be extracted by comparing the two frames and performing spray segmentation, that is, finding the pixels in frame 1864 that capture the spray projectile 1875 compared to the pixels in frame 1862 where no projectile is detected. In this case, one or more statistical and image analysis techniques, including line fitting and masking functions, can be used to determine that the pixels detected in frame 1864 but not detected in frame 1862 are a spray projectile. Since spray projectiles are likely line shaped, the pixels related to the spray can be line fitted. Other image differential techniques can be applied to detect the spray beam, including outlier rejection and using priors for masking outliers. The priors can be an expected region such as that outlined by predicted spray path 1876. In one example, the difference in pixel profiles detected from a first frame to a subsequent frame, accounting for homography estimation due to changes in translation of the image sensor, can generate a projectile segmentation. Similar techniques can be used to detect the splat, or spot detection, of the spray outcome on the surface of the target and ground, for example, seeing the color of the ground and target plant change from unsprayed to sprayed. For example, a liquid projectile hitting a target plant will morph from a projectile having a small cross-sectional diameter to a flat area covering a portion of the dirt or leaf. In this example a liquid projectile may change the color of the dirt surrounding a plant, due to dry dirt turning wet from the liquid projectile hitting the dirt. In this case, the image sensors can detect a color change in the ground and determine that a splat is detected and that a detected target object for treatment has been treated, which can then be logged or indexed by the treatment system. In one example, a stereo pair of cameras can detect sprays in each camera and associate them with each other to fit a 3D line such that the system can detect and index a spray in the real world with 3D coordinates.
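
A compact sketch of this homography-compensated differencing, assuming OpenCV, is shown below: estimate the motion between the two frames from feature matches, warp the earlier frame onto the later one, difference them, and line-fit the residual pixels as a spray candidate. The thresholds, feature counts, and function name are illustrative assumptions rather than values from this disclosure.

```python
# Minimal sketch: frame differencing with homography compensation plus line fitting.
import cv2
import numpy as np

def spray_candidate(frame_a, frame_b):
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)

    # Match features to estimate camera motion across the (approximately planar) bed.
    orb = cv2.ORB_create(2000)
    kp_a, des_a = orb.detectAndCompute(gray_a, None)
    kp_b, des_b = orb.detectAndCompute(gray_b, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # Warp frame A into frame B's coordinates and take the residual difference.
    h, w = gray_b.shape
    warped_a = cv2.warpPerspective(gray_a, H, (w, h))
    diff = cv2.absdiff(gray_b, warped_a)
    _, changed = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)

    pts = cv2.findNonZero(changed)
    if pts is None or len(pts) < 50:
        return None                                   # no plausible spray in this pair
    pts = pts.reshape(-1, 2).astype(np.float32)
    vx, vy, x0, y0 = cv2.fitLine(pts, cv2.DIST_L2, 0, 0.01, 0.01).flatten()
    return (x0, y0), (vx, vy)   # a point on the fitted spray line and its direction
```
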

FIG. 17B illustrates a diagram 1803 for determining spray accuracy and spray health, spray health being whether external factors, beyond correctly detecting the target object, lining the treatment head up on the target object, and tracking it as the target object moves away from the treatment unit (since the treatment unit is on a moving vehicle), affected the spray. To evaluate this, a prior or predicted spray path 1876 can be generated. For example, a sensor, disposed on a moving vehicle, can receive an image frame 1862 having a plurality of crop objects and target objects, including detected target object 1872. The treatment system will target the target object 1872, track the object 1872 in subsequent frames, such as that of frame 1862, and emit a projectile onto target 1872. In one example, due to external factors not necessarily related to computer vision, such as portions of the treatment unit no longer being calibrated to the image sensor, targeting a specific location in the real world from a detection in the image frame may result in a misalignment of the line of sight of the treatment head. For example, the treatment system, given frame 1862 or a subsequent frame, may target the target object 1872 at the correct real-world location, but in instructing the treatment head to aim its nozzle at target object 1872 in the real world may in fact be targeting a location 1879 or 1878, that is, an incorrect or misaligned location in the real world that the treatment system's image sensor would capture. In this case, to quality check the spray targeting and spray action, the treatment system can predetermine a predicted spray path 1876 and perform spray segmentation and the other computer vision and machine learning techniques described above only in that portion of the image, and therefore compare pixels contained in the region defined by the predicted spray path 1876. If the detection is not good enough, such as when a line cannot be fitted, the system can determine that the spray did not happen, or happened but not at the intended target. Alternatively, the system can perform spray segmentation on the spray that was detected, whether within the predicted spray path 1876 or not, and determine whether the end of the spray or the splat detected lines up with the intended target. Thus, seeing where a target object should have been sprayed, and/or should have had a splat detected, and where the actual spray profile was detected, including its 3D location, and where the spray splat was detected, can be used to evaluate the spray health of that particular spray and whether intrinsic or extrinsic adjustments need to be made. The adjustments can account for wind that may have moved the spray, the speed of the vehicle not being accounted for properly as the system tracks an object from frame to frame, or mechanical defects such that the intended target and the line of sight, after sending the correct instructions to orient the treatment head of the treatment unit, are misaligned. Upon detecting an inaccurate or incorrect spray projectile, one or more of the discussed defects can be accounted for in real time and a second projectile can be reapplied onto the target object and tracked again to evaluate its trajectory, spray health, and accuracy.
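
Such a spray-health check could be scored, for instance, by comparing the detected splat location with the intended target and testing whether the fitted spray line stays inside the predicted spray-path corridor. The NumPy sketch below illustrates one such scoring under those assumptions; the pixel tolerances, field names, and corridor representation are hypothetical.

```python
# Minimal sketch: score a single spray against its intended target and predicted path.
import numpy as np

def point_to_line_distance(p, line_point, line_dir):
    d = np.asarray(line_dir, dtype=float)
    d /= np.linalg.norm(d)
    v = np.asarray(p, dtype=float) - np.asarray(line_point, dtype=float)
    return float(np.linalg.norm(v - np.dot(v, d) * d))

def spray_health(splat_xy, target_xy, fitted_point, corridor_center, corridor_dir,
                 corridor_half_width=15.0, splat_tolerance=10.0):
    inside_corridor = point_to_line_distance(fitted_point, corridor_center,
                                             corridor_dir) <= corridor_half_width
    splat_error = float(np.linalg.norm(np.asarray(splat_xy, float) -
                                       np.asarray(target_xy, float)))
    return {
        "within_predicted_path": bool(inside_corridor),
        "splat_error_px": splat_error,
        "hit": inside_corridor and splat_error <= splat_tolerance,
    }
```
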

FIG. 17C illustrates an example method 1804 that may be performed by some example systems or subsystems described in this disclosure either online, that is, onboard a vehicle supporting one or more modular agricultural observation and treatment systems, subsystems, or components of systems, or offline, that is, at one or more servers or edge compute devices.

For example, at step 1806, the observation and treatment system or server can identify a first object for treatment. In this example, the observation and treatment system or a server is analyzing the performance of the online observation and treatment system during its latest run in a location such as an agricultural geographic boundary. The system, online or at a server, can identify each treatment performed or instructed to be performed on the geographic boundary for verification, indexing, and adding the verification to each identified target object's treatment history. For example, a treatment system may have identified and initiated a few thousand or a few hundred thousand actions performed in a single run at a field, orchard, or farm, and a server is analyzing the treatment accuracy and efficacy of each of the actions performed on the field in that particular run. At step 1808, the observation and treatment system or server can determine a treatment unit activation for each of the objects for treatment. In this optional step, the system or server can determine treatment actions based on the treatments performed and logged previously in real time while the observation and treatment system was on the field detecting objects and performing treatments. In this example, the server does not have to analyze every frame captured and determine a second time which detected objects were treated, but instead can analyze only those frames captured by image capture devices in which each online and onboard compute unit has already detected objects. In one example, the determining of treatment activation can include treatment parameters such as desired spray size, volume, concentration, mixture of spray content, spray time of flight, etc.

At step 1812, the observation and treatment system or server can detect a first emission pattern. This can be done with the techniques described above, as well as image correspondence from a previous frame to a subsequent frame to detect a projectile.

At step 1813, the observation and treatment system or server can index the first emission pattern. This can be stored as a 3D vector, or a 2D or 3D model of the full 3D profile with shape and orientation mapped into a virtual scene.

At step 1814, the observation and treatment system or server can detect a first treatment pattern. This can be the splat detection from a color change in the dirt from a first frame to a subsequent frame, performed by methods similar to those described above.

At step 1815, the observation and treatment system or server can index the first treatment pattern.

At step 1816, the observation and treatment system or server can determine and index the first object as treated. For visualization purposes, a target object that has not been accurately treated can have a bounding box with a dotted line, indicating a detection of the object itself but no detection of a spray onto that target object. Once a spray or treatment is detected, by projectile or splat detection, the dotted line can convert to a solid line, as illustrated in diagram 1803 of FIG. 17B.

As illustrated in FIGS. 17D and 17E, each spray projectile and splat detection can be indexed and visually displayed in a user interface. The 2D or 3D models 1880a, 1880b, and 1880c represent each target object 1872, spray projectile 1875, and splash 1877 onto a surface of the ground and target object. Additionally, the 3D models can be superimposed on each other to reconstruct the spray action, from the targeting of the target object, to the spraying of the target object, to the splash made and splat detected, as illustrated by superimposed model 1882 in model 1880d of diagram 1806.

FIGS. 18-21 illustrate various examples of performing agricultural observation, digitizing a geographic boundary, building a map of each individual agricultural object or crop detected and associating captured images of agricultural objects from one moment in time to another to digitize and map a farm with a location and image history of each agricultural object detected, targeting and tracking objects, and treating each individual agricultural object.

The description of buds, blooms, flowers, fruitlets, and other agricultural objects and stages of growth of such agricultural objects is only meant to be an example series of objects that can be detected by a treatment system detecting fruits and objects associated with the stages of growth of fruits on fruit trees, and is not meant to be limited only to the specific example described above. For example, agricultural objects can include larger objects or portions of a tree that support a crop, which can be detected, classified, and labelled for spraying, including spurs, shoots, stems, laterals, other nodes, fruiting clusters, leaves, or other portions of a tree. Different types of plants can be treated by the treatment system, including general crop plants and specialty crops, including fruits, vegetables, nuts, flowers, herbs, foliage, etc. The agricultural treatment systems described in this disclosure can operate in geographic boundaries typically appropriate for a robotic vision and treatment deposition system for observing, treating, harvesting, or a combination thereof, of crops, such as farms, orchards, greenhouses, nurseries, or other regionally and topographically bounded locations for agronomy and agriculture, horticulture, floriculture, hydroculture, hydroponics, aquaponics, aeroponics, soil science and soil agronomy, pedology, etc.

FIG. 18 illustrates a vehicle having coordinates associated with rotational movement, including roll about an X axis, pitch about a Y axis, and yaw about a Z axis, as well as translational coordinates associated with lateral movement, including an X, Y, and Z position in a geographic boundary. The vehicle 2110 illustrated in FIG. 18 can move with at least 6 degrees of freedom. Additionally, the treatment unit 2113 of the treatment system 2112 can also have coordinates associated with rotational movement, including roll about an X axis, pitch about a Y axis, and yaw about a Z axis, as well as translational coordinates associated with lateral movement, including an X, Y, and Z position in a geographic boundary. This can include rotating and moving a gimbal assembly of the treatment unit 1653 to a desired pitch angle 2002 and desired yaw angle 2004 when the treatment unit is configuring and orienting itself to position a nozzle or head of the treatment unit 1653 at a target, or aligning a line of sight towards a target for emitting a projectile.

FIG. 19A illustrates a diagram 2400 including a vehicle 2410, having one or more sensors 2418 and other electronic devices, supporting and towing one or more treatment systems 2412. In one example, the vehicle 2410 can be a tractor towing a plurality of modular treatment systems 2412. FIG. 19B illustrates the diagram 2402 with an alternate orientation of the treatment systems 2412 being towed by vehicle 2410.

As further illustrated in FIG. 20A, a vehicle 2410, such as a tractor, is configured to tow one or more treatment systems 2412 along a vehicle track 2430 having multiple lanes for the vehicle to operate in a geographic boundary. Between each vehicle track 2430 are one or more rows of agricultural objects 2432, such as plants, including crop plants and weed plants, for each treatment system 2412 to scan across each row to observe and treat individual plants growing from the ground.

As illustrated in FIG. 20B, the treatment systems 2412 can be configured to observe a plant, soil, or agricultural environment, treat a plant, soil, or agricultural environment, or a combination thereof, such as treating a plant for growth, fertilizing, pollinating, protecting and treating its health, thinning, harvesting, or treating a plant for the removal of unwanted plants or organisms, or stopping growth on certain identified plants or portions of a plant, or a combination thereof.

In one example, the treatment systems can be configured to observe and treat soil for soil sampling and mapping of features and chemical compositions of the soil, including soil deposition, seed deposition, or fertilizer deposition, and nutrient management, for both cultivated and uncultivated soil. The agricultural objects described above for targeting and treating can be specific patches of soil that can be identified, and their features and classifications labelled, by a vision system of the treatment system. Each patch or region of the soil detected by the treatment system 2412 can be indexed and mapped with a timestamp associated with the moment the patch or region was sensed, and a treatment history detailing each treatment applied to each patch or region of the soil.

FIG. 21 illustrates a diagram 2408 depicting an example treatment system having a plurality of component treatment modules 2444 supported by a support member 2440 and a navigation unit 2442 (the various sensors of the navigation unit 2442 may not necessarily be enclosed in the box illustrated in diagram 2408).

In one example, each component treatment module 2444, via its own compute unit, image sensors, and other sensors, can perform VSLAM to continuously map a local environment in the agricultural scene and continuously generate a pose estimation, such as a local pose estimation relative to objects or landmarks detected on the ground, such as plant objects, patterns, or salient points representing unknown objects near the ground, including target plant objects. Additionally, the navigation unit 2442, via its own compute unit, image sensors, GPS, IMU, and other sensors, can perform VSLAM and VIO to continuously map a global scene and continuously generate a pose estimation, such as a global pose estimation of the global scene. The compute unit of each treatment module 2444 can account for both its locally determined pose estimation relative to objects and landmarks on the ground, and the globally determined pose estimation of the global scene relative to a point of origin in an agricultural environment, such as a farm. This is because each of the treatment modules 2444a, 2444b, 2444c, etc. is rigidly attached to a support structure supported by a vehicle having sensors located throughout the vehicle associated with the navigation unit, such that a translation of the vehicle, which includes a detected change in global pose estimation, will produce substantially the same translation of each component treatment module, and therefore a change in the global pose estimate can be detected and accounted for at each component treatment module.

In one example, tracking multiple poses for each component treatment system, namely the local pose generated from sensors local to the component treatment system and the global pose received from the navigation unit, can account for loss and/or inaccuracies of kinetic motion transferred from the vehicle to each of the component treatment systems, particularly the component treatment systems that are farther away from the vehicle itself, relative to modules supported by the vehicle that are closer to the vehicle. This is especially apparent in farming activities, where any agricultural observation and treatment system will likely be operated on rough topography, such that movement along a path will cause bumps of various magnitudes, and thus changes in height, along the path. For example, as a vehicle navigates rough terrain, the component treatment module 2444c will likely bump up and down more violently than the component treatment module 2444a. Thus, it may be impractical for each component treatment system to determine its local pose estimation only from the global pose estimation, as movement of the sensors of the navigation unit 2442, or of the navigation unit 2442 box itself, will be different from the movement of, for example, the component treatment module 2444c. In this case, each of the component treatment modules 2444 can determine its local pose to more accurately detect and track targets in real time for treatment actions.
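
One way to picture this is to start each module's pose from the vehicle's global pose composed with the module's fixed mounting offset, and then apply a small locally sensed correction (for example, from the module's own VSLAM) that captures bumps the navigation unit does not see. The sketch below, using 4x4 homogeneous transforms, is illustrative only; the helper names and the example 2 cm bump are assumptions, not values from the disclosure.

```python
# Minimal sketch: global-plus-local pose handling for a component treatment module.
import numpy as np

def transform(translation, yaw=0.0):
    """Build a simple transform from a translation and a yaw rotation (illustrative only)."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T

def module_pose(T_world_vehicle, T_vehicle_module_mount, T_local_correction=None):
    """Module pose from the rigid mount, optionally refined by a locally sensed correction."""
    T = T_world_vehicle @ T_vehicle_module_mount
    if T_local_correction is not None:
        T = T @ T_local_correction   # e.g. absorb a bump seen only by the module's sensors
    return T

# Example: module 2444c rides 2 cm higher than the rigid mount predicts over a bump.
T_world_vehicle = transform((15.0, 4.0, 0.0), yaw=1.5)
T_mount_2444c = transform((0.0, -3.0, 0.5))
bump = transform((0.0, 0.0, 0.02))
T_world_2444c = module_pose(T_world_vehicle, T_mount_2444c, bump)
```
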

In one example, when a compute unit of the first treatment module 2444 sends instructions to each of one or more treatment units, for example treatment devices with one or more nozzles on a turret or gimbal mechanism, the agricultural treatment system can determine the specific pose of each of the nozzle heads at the time the treatment module, through local or global poses, global being at the vehicle and local being at or near each treatment module 2444, detects and identifies an object and its location relative to the treatment module, as well as determine the location of the object in the global scene. At this point, the compute unit of the first treatment module can account for the vehicle's pose in the global scene, that is, the global registry of a farm, the pose of the treatment module itself relative to the local first target object, as well as the last state of orientation of the treatment unit's nozzle or emitter's line of sight relative to the treatment module. This is because the vehicle, the treatment module, more specifically the treatment module's local sensors, and the treatment unit are all mechanically coupled in a fixed position to each other. Thus, a change in pose estimation generated by sensor signals of the vehicle itself will directly translate to the same change in pose estimation for anything physically supported by the vehicle. However, calculating a pose at each treatment module, as well as accounting for the pose of the vehicle, with sensors and computer vision techniques such as performing visual SLAM using machine learning to detect objects to track, can further improve accuracy, particularly for treatment modules that are disposed farther away from the vehicle, and therefore from the sensors of the navigation unit 2442, as compared to other treatment modules closer to the vehicle.

In one example, across multiple rows, each treatment module 2444 can determine a pose estimation based on its own local pose, determined with the local sensors embedded in or supported by the module 2444, and on the vehicle's pose. Thus, each object identified can be indexed in the real world such that if the vehicle operates on the same geographic area on a subsequent day, or at any subsequent time after a break in operation has occurred, an object detected at the subsequent time can be matched and associated with an object previously identified. In at least both cases, the treatment system determined the location of each identified object in the real world, or global scene, by approximating its location in the real world with the treatment system's sensed and determined global map of the geographic boundary, and further narrowing down its local position relative to a point in the global map of the geographic boundary to a specific point in the geographic boundary using each treatment module's sensed and determined local position of the object relative to the vehicle and/or treatment module.

FIG. 22 illustrates an example method 2450 that may be performed by some example systems or subsystems described in this disclosure either online, that is, onboard a vehicle supporting one or more modular agricultural observation and treatment systems, subsystems, or components of systems, or offline, that is, at one or more servers or edge compute devices. For example, at step 2452, an agricultural observation and treatment system can determine a first vehicle pose estimation. At step 2454, the agricultural observation and treatment system can determine a first treatment module pose estimation. At step 2456, the agricultural observation and treatment system can determine a first orientation of a treatment unit. At step 2458, the agricultural observation and treatment system can determine a first pose estimation of the first treatment unit. This is done by accounting for the pose estimation of the first treatment module operably and rigidly connected to the treatment unit and knowing the prior orientation, such as the first orientation, of a treatment head of the treatment unit, such as the orientation of the treatment head when it last sprayed a projectile. At step 2460, the agricultural observation and treatment system can determine a location of a first target object.
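
The chain of poses in method 2450 can be read as a composition of transforms: vehicle pose, module mount, last commanded head orientation, and finally the target's location expressed in the global frame. The NumPy sketch below is an illustration under that reading; the numeric poses and frame names are invented for the example, not taken from the disclosure.

```python
# Minimal sketch: compose vehicle -> module -> treatment-unit head poses (steps 2452-2460).
import numpy as np

def rot_rpy(roll, pitch, yaw):
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

def pose_matrix(xyz, rpy):
    T = np.eye(4)
    T[:3, :3] = rot_rpy(*rpy)
    T[:3, 3] = xyz
    return T

# Steps 2452/2454/2456/2458: chain vehicle pose -> module mount -> last head orientation.
T_world_vehicle = pose_matrix((12.0, 3.5, 0.0), (0.0, 0.02, 1.57))   # from the nav unit
T_vehicle_module = pose_matrix((0.0, -2.0, 0.4), (0.0, 0.0, 0.0))    # rigid mount
T_module_head = pose_matrix((0.0, 0.0, -0.1), (0.0, 0.35, -0.2))     # last spray orientation
T_world_head = T_world_vehicle @ T_vehicle_module @ T_module_head

# Step 2460: a target detected at a module-relative location, expressed globally.
target_in_module = np.array([0.25, 0.10, -0.60, 1.0])
target_in_world = (T_world_vehicle @ T_vehicle_module) @ target_in_module
```
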

FIG. 23 is a diagram illustrating pose determination of the agricultural observation and treatment system, according to some examples. The figure illustrates an example of a vehicle 2721 with treatment unit 1653 attached, shown moving along a path 2712. For example, as illustrated in FIG. 23, a vehicle 2721, such as a tractor, may be configured to tow one or more treatment systems along a vehicle track having multiple lanes for the vehicle 2721 and its towed support.

If the vehicle 2721 were to remain in a stopped position, the system 400 could spray the target object 2720 and then move on to the next target object, and then stop and spray the next target object. However, the system 400 is flexibly configured to allow the continuous movement of vehicle 2721 and make adjustments to the position of the spraying head of the treatment unit 1653. While the vehicle 2721 is moving along the path, the system 400 may determine a pose for the vehicle (e.g., Vehicle POSE₀, Vehicle POSE₁, Vehicle POSE₂ . . . Vehicle POSE_(n)) and/or for the treatment unit 1653 (e.g., Unit POSE₀, Unit POSE₁, Unit POSE₂ . . . Unit POSE_(n)). For example, using onboard navigation and IMU sub-systems, the system 400 may determine multiple locations or positions of the vehicle while the vehicle is moving along the path 2712.

As noted above, the treatment unit 1653 emits a fluid at a target object 2720. While the vehicle 2721 is moving, the system 400 determines a translation of the Vehicle POSE_(n) and/or the Unit POSE_(n) to a Spray POSE_(n) such that the spraying head is oriented or positioned to allow an emitted projectile fluid to spray upon a desired target object. For example, the system 400 may determine that a target object 2720 is to be treated. The system 400 determines a Vehicle POSE₀ and/or a Treatment Unit POSE₀. The system 400 will provide instructions/signals to the motors of the treatment unit 1653 to adjust one or more axes (e.g., pitch 2732, yaw 2734 and/or roll) of the spraying head. As the vehicle 2721 moves along the path 2712, the system 400 periodically determines n poses of the vehicle 2721 and/or the treatment unit 1653. The system 400 then translates the periodically determined n poses to an nth spraying head pose such that the treatment unit may continually spray the target object 2720 while the vehicle is moving. The system 400 may evaluate speed, movement, velocity, direction, altitude, and location of the vehicle 2721 and/or treatment unit 1653 and determine a pose for the spray head.

As used herein, pose may be understood to be a location and orientation of an object relative to a frame of reference (e.g., x, y, z, phi, theta, psi, where x=an x-axis coordinate in a 3-dimensional coordinate system, y=a y-axis coordinate in a 3-dimensional coordinate system, z=a z-axis coordinate in a 3-dimensional coordinate system, phi=degree or position of roll, theta=degree or position of pitch, and psi=degree or position of yaw). For example, the agricultural treatment system may determine pitch, roll, and yaw values of the vehicle, treatment unit, and/or the spraying head assembly. In some embodiments, the agricultural treatment system may not be configured to identify a pitch, roll, and/or yaw of the vehicle, treatment unit, and/or spraying head. In such instances, the value for these variables may be set to zero.

A global frame of reference may be provided for an environment in which the agricultural treatment system operates. For example, a global frame of reference may be set to a particular geospatial location or a fixed reference point on a property (e.g., a corner of a barn, a structure, a 5g/wifi/gps tower, etc.). The point of reference may be defined as (x=0, y=0, z=0, phi=0, theta=0, psi=0). The agricultural treatment system may determine multiple poses of the vehicle, in relation to the point of reference, as the vehicle moves about the environment. The pose of the vehicle may be defined as vehicle(x_(n), y_(n), z_(n), phi_(n), theta_(n), psi_(n))_(time_interval), where the system may determine the nth values at a particular time interval, sampled at a particular sample rate (such as 200-5000 times a second). The agricultural treatment system may also determine a pose for a treatment unit, such as treatment_unit(x_(n), y_(n), z_(n), phi_(n), theta_(n), psi_(n))_(time_interval). The agricultural treatment system may also determine a pose for a sprayer head of the treatment unit, such as spraying_head(x_(n), y_(n), z_(n), phi_(n), theta_(n), psi_(n))_(time_interval). The sprayer head may have a pose relative to the vehicle pose, may have a pose relative to the treatment unit pose, and/or may have a pose directly in relation to the global frame of reference. The agricultural treatment system may determine a final spraying_head (x, y, z, phi, theta, psi) pose to be used to adjust the spraying head to a different position. The final pose can be relative to the body of the treatment unit, the sprayer apparatus components, the vehicle or components thereof, and/or relative to some (0, 0, 0, 0, 0, 0) location of the farm.
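
The bookkeeping above can be sketched with a small pose record and a helper that expresses one pose relative to another, for example the sprayer head relative to the vehicle. The Python below is illustrative only; it uses a yaw-only rotation for brevity, and the sample values are invented.

```python
# Minimal sketch: a pose tuple (x, y, z, phi, theta, psi) relative to a farm origin,
# plus a helper that re-expresses a child pose in a parent frame.
from dataclasses import dataclass
import numpy as np

@dataclass
class Pose:
    x: float
    y: float
    z: float
    phi: float    # roll
    theta: float  # pitch
    psi: float    # yaw

def relative_pose(child: Pose, parent: Pose) -> Pose:
    """Child pose expressed in the parent frame (yaw-only rotation for brevity)."""
    dx, dy, dz = child.x - parent.x, child.y - parent.y, child.z - parent.z
    c, s = np.cos(-parent.psi), np.sin(-parent.psi)
    return Pose(c * dx - s * dy, s * dx + c * dy, dz,
                child.phi - parent.phi, child.theta - parent.theta, child.psi - parent.psi)

# Sampled e.g. 200-5000 times per second; here one sample with invented values:
vehicle = Pose(120.0, 45.0, 0.0, 0.0, 0.01, 1.2)
head = Pose(119.5, 44.2, 0.6, 0.0, 0.30, 1.45)
head_in_vehicle_frame = relative_pose(head, vehicle)
```
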

As described herein, the agricultural treatment system may determine the pose of the vehicle and/or treatment unit and translate the pose into commands or instructions to adjust a spraying head assembly to emit fluid at a desired target object. In other words, the agricultural treatment system may identify a target object to be sprayed, orient a spraying head assembly toward the target object, and then control fluid spraying operations to emit fluid from one or more fluid sources at the target object. The system can move along a path and make adjustments to the spraying head assembly such that the fluid is continuously sprayed at the target object, and/or detect new target object(s) to be sprayed and then position the spraying head assembly toward the detected new target object(s).

While the above describes pose determination for a vehicle or treatment unit, the system may determine a pose for any part or object of the system (e.g., a seat, the vehicle, a wheel, a treatment unit, a spraying head, a spray box, a turret, a nozzle tip, etc.). The pose may be determined with one or more different sensors (e.g., a camera can be positioned to obtain imagery of different parts or components), and the system can estimate the pose of the parts or components. The system may use computer vision, lidar, radar, sonar, GPS, vslam, wheel encoders, motor encoders, IMU, or cameras on a spray box. In some embodiments, the system may be configured to determine poses for, for example, the vehicle and multiple treatment units. This may be done, for example, where the vehicle is pulling a trailer with many spray boxes placed along a frame or support that has many wheels. Each of the spray boxes may have different poses due to the ruggedness or unevenness of the terrain.

The system may be configured to determine particular poses of the vehicle as a global pose and of the treatment units as local poses. A local pose for each treatment unit may be determined in relationship to the global pose, and/or may be determined individually for the treatment unit without relationship to the global pose. The system may use the global pose (a.k.a. the vehicle pose) as a localization method to determine its relationship to a real-world environment, and the system sensors may obtain information about the real-world environment. This allows the system to build a high-fidelity map of an agricultural environment (such as a farm). In one embodiment, the system uses a navbox and sensors to determine the global pose.

The system may use the local pose of a particular component for certain operations. As discussed herein, the system may determine a pose for a treatment unit and a spraying head. The system would use the local pose of these components to determine the physical relationship between the component and a target object. For example, two different treatment units may each have a spraying head. A first treatment unit and spraying head may need to spray a first target object, and a second treatment unit and spraying head may need to spray a second target object. In this situation, the system may determine a pose for each spraying unit and each of the spraying heads, and then maneuver or orient the spray nozzle of the spraying heads toward their respective target objects. In one embodiment, the system would use the local poses to orient the spraying heads to emit a projectile fluid at the respective target objects.

In one example, the agricultural treatment system determines multiple vehicle and/or treatment unit poses. The system evaluates a first pose, and then periodically determines subsequent poses. The system may calculate the difference or changes of the coordinate values between the first pose and a subsequently obtained pose. In other words, the system may calculate the movement of the vehicle and/or treatment unit. The calculated difference or changes may then be translated to a desired pose for the spraying head. The sample rate of the pose can be configured as a set rate or a variable rate. For example, the system may evaluate its pose at predetermined intervals, such as every 5 milliseconds. In an alternate configuration, the system may use a variable sample rate such that when the vehicle speed increases, the pose determination rate increases. For example, the sample rate for determining a pose may be 5 milliseconds where the vehicle speed is from 1-3 mph, and the sample rate increases to a higher rate, such as every 2 milliseconds, where the vehicle speed is over 3 mph.
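
The variable sampling policy in this example can be expressed as a small helper, shown below as a sketch; the thresholds simply mirror the 5 ms / 2 ms and 3 mph figures from the example and are not prescriptive.

```python
# Minimal sketch: pose sample period as a function of vehicle speed.
def pose_sample_period_ms(vehicle_speed_mph: float) -> float:
    if vehicle_speed_mph <= 3.0:
        return 5.0   # 5 ms between pose determinations at roughly 1-3 mph
    return 2.0       # tighter 2 ms sampling once the vehicle exceeds 3 mph
```
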

In one example, the pose for the vehicle may be determined by evaluating data from various sources, such as onboard cameras, GPS, IMUs, wheel encoders, steering wheel encoders, LiDAR, RADAR, SONAR, and additional sensors that may be used to determine the vehicle's position in a real-world environment. The system may, for example, evaluate the sensor data at 200-5000 Hz.

In one example, the pose for a treatment unit 2800 may be determined in a similar manner to the vehicle, as one or more treatment units would be configured in a fixed position in relationship to the vehicle. A change in pose of the vehicle may be considered to be the same change in pose of the treatment unit. The treatment unit may have one or more processors and microcontrollers to monitor and determine the pose of the treatment unit. The processors and microcontrollers are configured to keep track of the treatment unit pose. The treatment unit processor periodically requests pose information for the vehicle from a vehicle processor system. The treatment unit processor may then determine its pose by using the pose of the vehicle and may offset the pose of the vehicle based on a distance value from where the respective treatment unit is positioned relative to the point at which the pose is determined for the vehicle. As discussed herein, the agricultural treatment system may include multiple treatment units. The processor of each respective treatment unit may determine the pose for that treatment unit. Thus, each of the treatment units may have a unique pose relative to the determined vehicle pose.

Each of the treatment unit processors and microcontrollers may determine a spraying head pose. As noted above, each of the treatment units continually polls or requests a vehicle pose from the vehicle's computer system and may determine a treatment unit pose. The treatment unit processors are configured to determine and evaluate the positions of the motors via the encoders coupled to the motors. The treatment unit processor obtains information from the microcontroller 2875 of the treatment unit regarding the encoder output. In other words, the encoders provide data output about the motors' position and/or rotational movement. The microcontroller receives the encoder output data and provides the data to the treatment unit processor. Similarly, the microcontroller may receive instructions or data from the treatment unit processor, and the microcontroller in turn may provide or translate the received instructions or data into instructions, voltages, and/or commands that cause the motors to rotate in one direction or the other. The axial rotation of the motor then causes the linkage assemblies to rotate, thereby causing the spraying head to change position. The treatment unit processor may determine a spraying head pose and provide instructions to the microcontroller 2875 to then make adjustments to each of the motors such that the spraying head is adjusted to the desired spraying head pose.

In one example, a first 3-dimensional coordinate system may be used for the vehicle and/or treatment unit pose, and a second 3-dimensional coordinate system may be used for the spraying head. Changes in the first 3-dimensional coordinate system may be mapped to the second 3-dimensional coordinate system. Distance moved in the second 3-dimensional coordinate system may then be calculated, and the distance moved can be translated into instructions/commands to rotate the motors by a certain amount, degree, or time to achieve a desired position of the motor.
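
A simple sketch of the last step, translating a desired change of spraying-head pitch and yaw into motor encoder targets, is shown below. It assumes one motor per axis, and the gear ratio and counts-per-revolution values are purely illustrative placeholders.

```python
# Minimal sketch: convert a desired pitch/yaw change into encoder count deltas.
import math

COUNTS_PER_REV = 4096        # encoder counts per motor revolution (assumed)
GEAR_RATIO = 5.0             # motor revolutions per revolution of the head axis (assumed)

def delta_angle_to_counts(delta_rad: float) -> int:
    revs_of_axis = delta_rad / (2.0 * math.pi)
    return int(round(revs_of_axis * GEAR_RATIO * COUNTS_PER_REV))

def head_adjustment(delta_pitch_rad: float, delta_yaw_rad: float):
    """Encoder deltas a microcontroller could command for each axis motor."""
    return {"pitch_counts": delta_angle_to_counts(delta_pitch_rad),
            "yaw_counts": delta_angle_to_counts(delta_yaw_rad)}
```
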

While the above discussion focuses on an example of a single treatment unit, the system may determine poses for multiple treatment units and adjust the pose of each respective unit's spraying head such that each of the spraying heads may lock on to its respective target object.

FIG. 24 is a block diagram illustrating an example configuration of the system with a treatment unit 2800 configured for various fluid source and spraying tip options as well as light source and laser-emitting tip options. In one example, the agricultural treatment system has onboard circuitry, processors, and sensors that allow the system to obtain imagery of agricultural objects and then identify a target object to be sprayed. Furthermore, the agricultural treatment system has onboard circuitry, processors, and sensors that allow the system to determine the position of the vehicle and/or treatment unit in a three-dimensional space. Moreover, the agricultural treatment system includes other cameras and computer vision sensors to obtain and process imagery of external real-world objects 2884. For example, block 2850 illustrates a subsystem having a compute unit 2851, communication channel 2854, cameras 2853, machine learning model and computer vision algorithm 2855, lights 2856, and other sensors 2852. For example, the system may use GPS location data and IMU data to identify inertial movement and distance moved. Over a period of time, the system may determine multiple poses of the vehicle and/or treatment unit and convert/translate these poses into the pose the spraying head would need to be positioned in, such that the spraying head maintains an emitted spray at the target object while the vehicle is moving.

The subsystem 2850 interacts with a treatment unit 2800. While a single treatment unit is shown, the subsystem 2850 may interact with and control multiple treatment units. Generally, the treatment unit 2800 includes a microcontroller that is operably coupled with one or more solenoids 2870, pumps, multiple motors 2820, 2830 and multiple encoders 2822, 2832. The treatment unit 2800 may draw fluid from one or more source tanks 2804. The subsystem 2850 may communicate via communications channel 2842 with another computer system. For example, the subsystem 2850 may receive global registry information and data (e.g., global registry information such as GPS location data, IMU data, VSLAM data, etc.).

The microcontroller 2875 may control or interact with the pump, solenoid 2870A, motors 2820, 2830 and encoders 2822, 2832 to position the treatment head assembly 2860 and emit fluid from one or more fluid sources. For example, based on interaction with the subsystem 2850, the treatment unit 2800 may control the position of a treatment head assembly 2860 to orient the treatment head assembly 2860 such that the treatment head assembly 2860 may emit a fluid at a target object 2885. In one example, the system includes a treatment unit with a single fluid source tank 2804A and a single solenoid 2870A, and a spraying head 2862A with a single port.

In one example, the treatment unit 2800 can include multiple fluid sources that may be combined or mixed with a primary fluid source. The microcontroller 2875 may operate a solenoid 2870A to control the flow of a primary fluid source, such as water. The primary fluid source may then be combined with one or more secondary fluid sources disposed near the treatment head. The secondary fluid sources may be concentrated chemicals or fertilizers that are mixed with the primary fluid source to dilute the concentrated chemicals and create a chemical mixture as the primary fluid source travels close to the end of the line from a tank to the treatment head assembly 2860. While not shown, each of the secondary fluid sources may be controlled via separate solenoids and pumps that cause the secondary fluid sources to disperse fluid from a tank. The combined mixture of the primary fluid source and the one or more secondary fluid sources is then emitted via the spraying head assembly 2860 through spraying tip 2862A with a single port.

FIG. 25 illustrates example implementations of a method 2900 that may be performed by some example systems described above. For example, in one mode of operation, at step 2910, the agricultural treatment system determines a first pose of a treatment unit or vehicle. The determination of the pose is described above with respect to FIGS. 23 and 28. At step 2920, the agricultural treatment system translates the first pose of the treatment unit or vehicle and determines a first pose of a spraying head of the treatment unit. At step 2930, the agricultural treatment system adjusts the spraying head position based on the determined first pose of the spraying head. The spraying head, for example, may be repositioned by instructing one or more motors of the agricultural treatment system to rotate, thereby causing the spraying head to pivot or rotate along one or more axes. At step 2940, the agricultural treatment system controls a fluid flow regulator (such as a solenoid or another control device) to allow fluid to be emitted from the spraying head at a target object. The vehicle may move along a path, and while doing so the agricultural treatment system may periodically determine n poses. At step 2950, the agricultural treatment system determines an nth pose of the treatment unit and/or vehicle while the vehicle moves along the path and determines an nth pose for the spraying head. At step 2960, the agricultural treatment system adjusts the spraying head position based on the determined nth pose of the spraying head. Steps 2940 through 2970 may be repeated. The periodic pose determination process allows the agricultural treatment system to continually adjust the spraying head while the vehicle is moving so as to maintain the emitted spray at the target object.
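
The repeating portion of method 2900 amounts to a periodic re-aim loop while the valve is open. The sketch below illustrates that loop only; `get_unit_pose`, `aim_head_at`, and `open_valve` are hypothetical stand-ins for the system's pose query, head-orientation, and fluid-regulator interfaces, and the timing values follow the 5 ms example rather than any prescribed rate.

```python
# Minimal sketch: periodically re-determine the unit pose and re-aim while spraying.
import time

def track_and_spray(target_xyz, get_unit_pose, aim_head_at, open_valve,
                    duration_s=0.5, period_s=0.005):
    """Hold the spray on a target for `duration_s` while the vehicle keeps moving."""
    t_end = time.monotonic() + duration_s
    open_valve(True)
    try:
        while time.monotonic() < t_end:
            unit_pose = get_unit_pose()          # nth pose of the treatment unit
            aim_head_at(target_xyz, unit_pose)   # translate to an nth spraying-head pose
            time.sleep(period_s)                 # e.g. 5 ms sample period
    finally:
        open_valve(False)
```
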

While the agricultural treatment system moves along a path, the system may continuously evaluate for additional target objects to be sprayed using one or more treatment units. As described above, the spraying head assembly may be positioned such that a fluid may be emitted at an identified target object. After a target object is sprayed with a fluid, the system may instruct the spraying head assembly to reposition to a ready position, such as the neutral position of x=0, y=0, or some other ready position. For example, while the vehicle is moving forward, a spraying head assembly may be pointed toward the forward path of movement. Doing so readies the spraying head assembly for when a new target object is detected. The system may instruct the treatment unit spraying head assembly to move into the ready position when the system is initially powered on. Moreover, the system may instruct a particular treatment unit spraying head assembly to move into the ready position after the spraying of a then-current target object is completed. Moving the spraying head assembly into a forward ready position allows the agricultural treatment system to readily start spraying subsequent target objects as soon as they are detected, without first having to move the spraying head to the target object.

In one example, the treatment unit 2500, which may have a high-powered laser unit or laser chip embedded in or supported by the treatment unit 2500, can be configured to treat portions of plants that are larger than plants that typically grow only a few inches or feet above the ground. These plants can include trees, orchard trees, or other plants with one or more trunks, shrubs, bushes, or other plants grown on trellises or other human-made mechanisms, such that a horizontally or top mounted treatment unit 2500 is more practical than a treatment unit substantially pointing at the ground with rotational freedom.

While the above disclosure contemplates the control of a spraying head assembly for the emission of a projectile fluid, the spraying head assembly may be replaced with a controllable laser head assembly. Also, a laser source may be attached to the spraying head assembly. The system may control the positioning of a laser head assembly to direct an emitted laser beam at a target object. The laser beam may be used, for example, to ablate, burn, or otherwise treat the target object with a laser light beam. Additionally, different laser beams of different wavelengths may be configured on the laser head assembly. The laser light may be focused to a desired diameter to treat a target object. In one embodiment, the spraying head includes a spraying nozzle and a laser-emitting tip, which may be disposed next to each other such that either the laser or the spray nozzle can activate upon targeting an object of interest for treatment.

The system may treat the target object based on the identified target object. For example, the system may set operative parameters of the laser to treat the target object (such as duration, frequency, wavelength, laser pulse repetition, etc.). Different target objects may be treated with different parameters using emitted laser light from the treatment head.

In one embodiment, the agricultural treatment system may be configured to monitor the health of the spraying head and determine whether the spraying head is accurately emitting a fluid at a target object. In some instances, the spraying tip may build up residue or other particulate. For example, the spraying head may disperse a fluid containing a solution of salts or of other compounds. Over time, salts or other compounds from the solutions may build up on the outer surface of the spraying head tip and cause an emitted fluid to deviate from an intended projected course. In other words, the emitted fluid may miss an intended target object if the emitted fluid deviates in its projected direction.

The system may correct for a deviation of the projected fluid by adjusting the spraying head to account for the deviation. As the fluid is emitted from the spraying head, an onboard camera may obtain imagery of the fluid as the fluid is emitted or projected at an intended target object. The system may determine whether or not the intended target object was actually sprayed by the emitted fluid. The system may calculate an adjustment by determining a distance and position of where the emitted fluid actually went, and where the fluid should have landed on the target object. The system can then determine an offset to make a spraying head positional adjustment such that subsequently emitted fluids would land at an intended location of the target object.

In one mode, the system may continuously emit fluid in a spray or in bursts of fluid, and then determine the location where the fluid is projected. The system may make slight or micro adjustments to the position of the spraying head assembly until the emitted fluid is sprayed at the target object at the intended location. The positional adjustment values may then be used as an offset for subsequent spraying. For example, an emitted spray may land 1.5 inches to the left of an intended location on a target object. The system can then move the spraying head toward the right of the target object and determine when an emitted projectile fluid accurately hits the target object. This allows the system to determine what position or distance the spraying head needs to move to correct for spraying location error.
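
The offset-learning idea can be sketched as a small calibrator that compares where the fluid actually landed with where it should have landed and keeps a running correction applied to future aims. The units, smoothing factor, and class name below are illustrative assumptions, not part of the disclosure.

```python
# Minimal sketch: accumulate a spraying-head aiming offset from observed misses.
import numpy as np

class SprayOffsetCalibrator:
    def __init__(self, alpha=0.5):
        self.offset = np.zeros(2)   # (x, y) correction in the aiming plane
        self.alpha = alpha          # how aggressively new evidence updates the offset

    def update(self, intended_xy, observed_hit_xy):
        error = np.asarray(intended_xy, float) - np.asarray(observed_hit_xy, float)
        self.offset = (1 - self.alpha) * self.offset + self.alpha * error
        return self.offset

    def corrected_aim(self, target_xy):
        return np.asarray(target_xy, float) + self.offset

# e.g. the spray lands 1.5 inches left of the target: the next aim shifts right.
cal = SprayOffsetCalibrator()
cal.update(intended_xy=(0.0, 0.0), observed_hit_xy=(-1.5, 0.0))
next_aim = cal.corrected_aim((10.0, 4.0))
```
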

In one example, the system may use computer vision to track a target object while the vehicle is in motion. The system may evaluate imagery of a target object with onboard cameras. The system may determine the position of features or objects in an image and evaluate the positional changes of pixels of the object moving in the image. The system may translate the pixel movement into adjustments to the spraying head assembly such that the system adjusts the spraying head assembly so that the treatment unit accurately emits a fluid at the target object.

This process of correcting for spraying head projectile deviation may also be used when a new spraying head tip is attached to the spraying head assembly. This process allows for initial configuration of a treatment unit to identify and correct for any deviation of an emitted fluid from the spraying tip.

FIG. 26 illustrates example implementations of a method 3000 that may be performed by some example systems described above. At step 3010, the agricultural treatment system determines a relative location of a target object. At step 3020, the system may emit a fluid at the target object via a treatment unit. At step 3030, the system monitors and tracks the fluid emitted at the target object. At step 3040, the system determines whether the emitted fluid sprayed the target object at an intended location. If yes, then at step 3044, the system determines a relative location of a second target object and continues to step 3020. If no, then at step 3050, the system determines an offset for the position and/or orientation of a spraying head. Next, at step 3060, the system determines a second relative location of the first target object. Then, at step 3070, the system positions and/or orients the spraying head, in part using the offset, to target the first target object.

For example, the agricultural treatment system may determine a first target object to be sprayed. The treatment unit emits a fluid at the first target object via a spraying head. The system may use an onboard computer vision system to monitor the emitted fluid at the target object. The system determines whether the emitted fluid sprayed the target object at an intended location. The system may evaluate obtained digital images and identify whether or not the emitted fluid actually sprayed the target object. The system may also determine at what distance and location the projectile stream deviated from the target object. The system may determine an offset for the position of the spraying head. For example, the system may calculate a positional adjustment to the spraying head so that the spraying head would spray the fluid at an intended target object. The system then may spray subsequent target objects. The system may determine a second target object to be sprayed. The system may then emit a fluid at the second target object via the spraying head using the offset. For example, the spraying head would be positioned in part using the determined offset.

FIG. 27 illustrates example implementations of a method 2600 that may be performed by some example systems described above. The agricultural treatment system may identify and determine that multiple targets are in close proximity to one another and that a particular target object can be treated while the spraying head assembly is positioned to emit fluids toward one of the target objects. The treatment unit may be configured such that the system may emit a fluid from one source tank at a first target object, and then emit a second fluid from another source tank. At step 3110, the system determines a first target object and a second target object for treatment. At step 3120, the system determines that the first target object is of a first type of target object, and the second target object is of a second type of target object. For example, the system may recognize the particular type of a target object using various computer vision and object detection techniques. At step 3130, based on the first determined object type, the system may treat the first target object with a first treatment from a first fluid source. The system may cause fluid from a first source tank to be emitted at the first target object. At step 3140, based on the second determined object type, the system may treat the second target object with a second treatment from a second fluid source. For example, two target objects may be identified as being in close proximity to one another.

FIG. 28 illustrates example implementations of method 3200 that may be performed by some embodiments of the systems described above. A treatment unit may pump fluid from different or multiple tank sources and treat a target object by emitting fluid from the different tank sources. For example, it may be desirable in some instances to treat a particular type of a target object, such as a bud or a flower, with fluid from multiple tank sources, whereas a further developed agricultural object may only need treatment from one of the tank sources. At step 3210, the system may determine a first target object for treatment. As discussed herein, the system may identify a target object to be treated. At step 3220, based on a first determined target object, the system may select from two or more fluid sources to treat the target object. As discussed herein, the system may include multiple tanks or containers of different fluids that may be used to treat a target object. The treatment unit may be configured to cause fluid pumped from the multiple tanks to be mixed together and emitted from a single spraying tip port, pumped separately and emitted sequentially from one source tank and then another, or emitted from a tip that has multiple spraying tip ports (e.g., a 4-port spraying tip). At step 3230, the treatment unit emits a first fluid at the determined target object from a first fluid source. At step 3240, the treatment unit emits a second fluid at the determined target object from the second fluid source. The system may determine that the target object is of a particular type of object, and then select from one or more source tanks to pump fluid and then treat the target object with the fluid.

FIG. 29 illustrates example implementations of method 3300 that may be performed by some embodiments of the systems described above. At step 3310, the system accesses an image of an agricultural scene having a plurality of objects. At step 3320, the system detects a plurality of target objects in the real world based on an object detection and localization in the image. At step 3340, the system determines a fluid profile for treating the first target object. At step 3350, the system sends instructions of a first treatment parameter to a treatment unit. At step 3360, the system activates the treatment unit to emit a fluid projectile at the first target object. At step 3370, the system identifies and tracks a second target object in the real world. At step 3380, the system determines a second fluid profile for treating the second target object. At step 3390, the system sends instructions of a second treatment parameter to the treatment unit. At step 3392, the system activates the treatment unit to emit a second fluid projectile at the second target object.

FIG. 30 illustrates example implementations of method 3400 that may be performed by some embodiments of the systems described above. At step 3420, the system identifies and tracks a first target object in the real world. At step 3430, the system determines a first desired spot size for treating the first target object. At step 3440, the system determines a fluid profile for the first desired spot size. At step 3450, the system sends instructions of a first treatment parameter for the fluid profile to a treatment unit. At step 3460, the system determines a first fluid pressure against the treatment unit. At step 3470, the system sends instructions based on the first pressure to activate a solenoid to allow release of a first fluid projectile. At step 3480, the system orients the treatment unit and activates the solenoid. At step 3490, the system determines a spray profile associated with the first fluid projectile. The following further describes these operations.

As discussed above, the system may use a pump to create pressurized fluid from the pump to a solenoid. The system may send specific voltage and pressure instructions to the solenoid such that, accounting for the distance between the turret nozzle and the surface of the target, a ⅛ inch to 5 inch diameter spot of the spray can hit the target. Moreover, the system may variably and incrementally change the liquid projectile for every spray. The system may utilize a predetermined base pressure from the pump to the solenoid, and then open and close the solenoid by providing voltage instructions (for example, 24 or 48 volts). The spot of the spray of the emitted projectile may be controlled by the system to achieve a desired spray amount and spray diameter to cover an area of a target object.

While the solenoid is completely closed, the fluid may be pressurized to a particular psi. For example, a pump may operate to pressurize the fluid in the range of 1-200 psi. A working line psi from the pump to the solenoid may be about 60 psi when the solenoid is completely closed. An emission tubing from the solenoid to the nozzle and/or spray tip would have a psi less than the working line psi when the solenoid is closed. The opening of the solenoid releases the pressurized fluid from the working line and causes the fluid to fill into the emission line and out through the nozzle and/or spray tip. Over a period of time, the emission line may build up pressure and cause the pressurized fluid to emit through the nozzle and/or spray tip. By quickly opening and closing the solenoid, the system may emit intermittent bursts of the fluid from the working line, causing the fluid to emit as a projectile via the nozzle and/or spray tip.

As fluid leaves the pressurized working line behind the solenoid, the pressurized working line behind the solenoid (from the pump to the solenoid) loses a small, but negligible, amount of pressure. As more and more fluid leaves the working line (for example, 100 bursts or shots), the overall drop in pressure becomes nontrivial. In certain situations, the pump may not be able to compensate for the drop in pressure until the pressure drop becomes significant enough. For example, the pressure may be 60 psi up to the wall of the solenoid. As the treatment unit emits 100 bursts of fluid projectiles, the pressure behind the solenoid may incrementally drop, for example, to 40 psi. At 40 psi, a pressure sensor may inform the pump and solenoid to accommodate for the pressure drop, and the system will increase pressure back to 60 psi (and in some instances may increase the psi in the working line to more than 60 psi to quickly reach a base 60 psi).

A situation may occur in which, as the pressure in the working line drops from 60 psi to 40 psi and then suddenly increases back to 60 psi, incremental bursts (e.g., shots) may be emitted at slightly different starting pressures. So if the system opens the solenoid at the same rate and same voltage every time, the solenoid actually may open at different rates (because the pressure pushing at the wall was different), and the emitted fluid projectiles are incrementally different. To account for the variability of the pressure in the working line, the system may account for the incremental drop in pressure. The system may generate instructions to change the voltages sent to the solenoid so as to maintain the same droplet size and/or fluid volume. The amount by which the solenoid opens each time may be slightly different so as to maintain the same trajectory, volume, and/or droplet size for the emitted fluid projectile (thereby accounting for the differences in the psi for each burst or shot of the fluid because the pressure pushing at the wall is different). The system may be calibrated and/or configured to open and close the solenoid at different time intervals and different time durations such that the amount of pressure is the same for each shot. Alternatively, the system may be calibrated to open and close at different time intervals and for different durations such that the amount of pressure is similar. The system may use a spray profile that controls the timing and duration of the opening and closing of the solenoid.
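One way to picture this compensation is as an adjustment of the solenoid open duration (or drive voltage) as a function of the measured working-line pressure. The Python sketch below is a simplified illustration under the assumption that the emitted volume scales roughly with open time and with the square root of line pressure (an orifice-flow approximation); the constants and function names are hypothetical, not values from this disclosure.

import math

NOMINAL_PRESSURE_PSI = 60.0   # base working-line pressure (example value)
NOMINAL_OPEN_MS = 5.0         # open duration giving the desired volume at 60 psi

def compensated_open_duration(measured_pressure_psi: float) -> float:
    """Scale the solenoid open time so each burst emits a similar fluid volume.

    Assumes flow through the open solenoid is roughly proportional to the
    square root of the upstream pressure.
    """
    if measured_pressure_psi <= 0:
        raise ValueError("pressure must be positive")
    scale = math.sqrt(NOMINAL_PRESSURE_PSI / measured_pressure_psi)
    return NOMINAL_OPEN_MS * scale

# Example: after ~100 bursts the line has sagged to 40 psi, so the solenoid
# is held open slightly longer to emit approximately the same volume.
print(compensated_open_duration(60.0))  # 5.0 ms
print(compensated_open_duration(40.0))  # ~6.1 ms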

For example, the system may receive one or more images of an agricultural scene having one or more agricultural objects, such as plant objects. The system may then detect a plurality of target objects based on the received one or more images. The system may identify a first target object in the real world from the detecting of the plurality of target objects in the received one or more images. The system may determine a set of first treatment parameters (e.g., a spray profile) for the first target object. Based on the first set of treatment parameters, the system may instruct a treatment unit to emit a fluid projectile at the first target object. The first set of treatment parameters may include one or more of: a spray speed for an emitted fluid, a spray size for an emitted fluid, a splat profile, and/or a spray duration. Based on the spray profile, the system controls a solenoid to release a pressurized fluid that has been pumped from a fluid source, the emitted fluid projectile including a portion of the released pressurized fluid. The system may adjust the opening and closing of the solenoid to account for a hysteresis band of a pressure drop in the pressurized fluid.

The system may continue to treat multiple other target objects, such as agricultural objects that are plant objects. The system may identify a second target object in the real world in the received one or more images. The identification of the other target objects may occur in real time, concurrently with or after the first target object is detected. The system may determine a set of second treatment parameters (e.g., a spray profile) for a second target object. The set of second treatment parameters may be the same as or different from the first treatment parameters for the first target object. For example, the system may determine a second spray size for the second target object that is different from a spray size for the first target object. The system may determine a second spray speed for the second target object that is different from a spray speed for the first target object. The system may determine a second spray volume for the second target object that is different from a spray volume for the first target object. Also, the system may determine a target splat size for the second target object that is different from a target splat size for the first target object. For example, a splat of the same amount of a drop of fluid may have a different impact shape when it impacts a surface. The impact shape may be changed based on the trajectory speed of the drop of fluid when it impacts the surface of the target.
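These per-target treatment parameters can be represented as a simple spray-profile record. The Python sketch below is one possible representation; the field names and numeric values are invented for illustration and are not taken from the disclosure.

from dataclasses import dataclass

@dataclass
class SprayProfile:
    """Per-target treatment parameters (illustrative fields only)."""
    spray_speed_mps: float     # speed of the emitted fluid projectile
    spray_volume_ml: float     # volume released per burst
    splat_diameter_mm: float   # desired impact (splat) size on the target surface
    duration_ms: float         # solenoid open duration
    fluid_source: int          # index of the source tank to pump from

# A larger weed might receive a bigger, faster burst than a small seedling
# growing close to a valuable crop.
large_weed = SprayProfile(spray_speed_mps=12.0, spray_volume_ml=1.5,
                          splat_diameter_mm=25.0, duration_ms=8.0, fluid_source=0)
small_weed = SprayProfile(spray_speed_mps=8.0, spray_volume_ml=0.5,
                          splat_diameter_mm=12.0, duration_ms=4.0, fluid_source=0)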

After the second set of parameters is determined, the system may activate the treatment unit to emit a second fluid projectile at the second target object based on the second treatment parameters. The system may confirm whether the second fluid projectile contacted the second target object (e.g., based on sensor data comprising digital imagery, lidar data, sonar data, radar data, or a combination thereof). In one example, the second target object can be the same target object as the first target object, such that the system is performing two treatment actions with two different treatment parameters, for example the chemical type or composition or state, onto at least a portion of a surface of the same object.

While the foregoing describes examples of the pressurized fluid spray with particular pressure and spray profiles, the system may be configured within various ranges. For example, the emitted spray may be emitted for about 1 millisecond to 1 second within the range of 40 to 80 psi. The system may be capable of generating a pressure of about 1 psi to 2700 psi in the working line. The volume of fluid released and emitted may be from about 1 microliter to 1000 milliliters. The target splat diameter can be about a dime size at 40 psi to about a quarter size at 60 psi. The projectile/droplet size may be about 1 millimeter to about 100 millimeters in diameter for a single drop (not volumetric).

In some other examples described below, some embodiments are implemented by a computer system 70000. One example is depicted in FIG. 37. A computer system 70000 may include a processor 70200, a memory, which may be optional and is omitted from FIG. 37, and/or a non-transitory computer-readable medium. The memory and non-transitory medium may store instructions for performing methods and steps described herein. The computer system 70000 may further include one or more interfaces 70400. These interfaces may be used for receiving data and images, sending data and images, or interacting with a human operator via a man-machine interface such as a keyboard and a display. Various examples and embodiments described below relate generally to robotics, autonomous driving systems, and autonomous agricultural application systems, such as an autonomous agricultural observation and treatment system, utilizing computer software and systems, computer vision, and automation to autonomously identify an agricultural object, including any and all unique growth stages of agricultural objects identified, including crops or other plants or portions of a plant, characteristics and objects of a scene or geographic boundary, environment characteristics, or a combination thereof.

Additionally, the systems, robots, computer software and systems, applications using computer vision and automation, or a combination thereof, can be configured to observe a geographic boundary having one or more plants growing agricultural objects identified as potential crops, detect specific agricultural objects on each individual plant and portions of the plant, determine that one or more specific individual agricultural objects in the real-world geographic boundary require a treatment based on their growth stage and treatment history from previous observations and treatments, and deliver a specific treatment to each of the desired agricultural objects, among other objects. Generally, the computer system provides computer vision functionality using stereoscopic digital cameras and performs object detection and classification and applies a chemical treatment to target objects that are potential crops via an integrated onboard observation and treatment system. The system utilizes one or more image sensors, including stereoscopic cameras, to obtain digital imagery, including 3D imagery, of an agricultural scene such as a tree in an orchard or a row of plants on a farm while the system moves along a path near the crops. Onboard light sources, such as LEDs, may be used by the system to provide a consistent level of illumination of the crops while imagery of the crops is being obtained by the image sensors. The system can then identify and recognize different types of objects in the imagery. Based on detected types of objects in the digital imagery, or the same object from one moment in time to another moment in time experiencing a different growth stage which can be recognized, observed, and identified by the onboard system, as well as the system associating the growth stage or the different label with a unique individual agricultural object previously identified and located at a previous growth stage, the system can apply a treatment, for example spray the real-world object with chemicals pumped from one or more liquid tanks, onto a surface of the agricultural object. The system may optionally use one or more additional image sensors to record the treatment, as a projectile, as it is applied from the system to the agricultural object in proximity to the system.

1. Additional Embodiment Scenarios

The agricultural industry is looking for new ways to keep up with ever-increasing demand. One possible way is to use more land for farming. However, growth in the area of arable land faces competing obstacles from the real estate industry, ease of connectivity with urban areas, and local regulations. Therefore, increasing the productivity of existing farmland is an attractive option.

The above-described techniques can be used in an agricultural setting to improve crop yield. Some additional embodiments are disclosed next. The term crop used herein may refer to fruits, vegetables, grains, and other agricultural products that are used for direct or indirect human or animal consumption. In one example aspect, the disclosed techniques may be used to detect undesirable growth in a farm that competes with desirable crops for water and nutrients. Such undesirable growth may include, for example, weeds or other plants that are not intended to be grown on the farm. Accordingly, weeds may be detected and eliminated using the disclosed techniques. Unless otherwise mentioned, the term “farm” is used herein as a short reference to various flora growing facilities such as crop farms, orchards, greenhouses, gardens, and so on.

2. Example System Overview

FIG. 31 shows an example of a setup 10000 that may be used for implementing some of the techniques disclosed in the present document. The setup 10000 includes an onsite platform 10400 and offsite computing resources 10200. The setup 10000 may further include a target area 12000. For example, the onsite platform 10400 may be a computer system and may be mounted on an agricultural vehicle that may be configured to operate in a farm or orchard. The example agricultural observation and treatment systems discussed above, including portions of the systems or component treatment modules, systems, and/or subsystems, whether performance of analysis, ingesting of sensor readings, or treatment actions are performed on compute units onboard or onsite a physical moving platform, or distributed at servers including cloud servers or edge computer devices, or a combination thereof, can be broadly interpreted and implemented in the discussions below, particularly with discussions related to the onsite platform 10400 and the functionalities and actions performed by the onsite platform 10400. In some embodiments, the onsite platform 10400 may be modular, e.g., comprising multiple modules that operate relatively independently of each other and are configured to implement various functionalities described in the present document. Each of the modules may comprise hardware that includes a processor and a memory, and may further include hardware such as cameras and/or sensors for estimating a local pose of the module, and a treatment application mechanism such as a spray turret or a laser source. The offsite computing resources 10200 may be located at a suitable location. For example, in some embodiments, the offsite computing resources 10200 may be located in close proximity to the onsite platform 10400 (e.g., on a shared platform or within 10 to 100 meters) and configured to provide and perform edge computing resources in real time in parallel with actions performed by the onsite platform 10400. In some embodiments, the offsite computing resources 10200 may be located or distributed in a remote location such as a data or control center, a cloud computing facility, and so on. The offsite computing resources 10200 may be either co-located or distributed across various locations. The various components of the setup 10000 may be implemented using other configurations disclosed in the present document, e.g., FIG. 2, FIG. 4, and FIG. 8.

In various embodiments, the target area 12000 may include one or more target items 10600, listed as item 1 to N in FIG. 31. The target area may be, for example, a rectangular portion of the ground being scanned by the onsite platform 10400. The one or more target items 10600 may include desirable vegetation (e.g., crops being grown) and/or undesirable vegetation (e.g., weeds or other growth that is not intended to be farmed) and/or other miscellaneous items such as trash, pebbles, soil clumps, and so on. The present document further describes how the target area is determined and processed for the treatment operation.

The communication link 10800, further described throughout the present document, may be a wired or wireless link carrying data and control from the offsite computing resources 10200 to the onsite platform 10400. The communication link 10800 may carry ML information, control data, and so on.

The communication link 11000 from the onsite platform 10400 to the offsite computing resources 10200 may be a wired or wireless communication link. The communication link 11000 may carry information that includes images, results of operation of the onsite platform 10400, diagnostic information, map data, and so on, as further disclosed in the present document.

The offsite computing resources 10200 may further receive inputs such as raw image data or training image data 11800, as is further described in the present document.

The onsite platform 10400 may receive input 11600 representing signals and information locally collected using various sensors, such as a camera and LiDAR (light detection and ranging), for performing functions such as image capture and pose detection, as is further described in the present document.

The link 11200 represents emanations from the onsite platform 10400 that may reach the target area 12000, such as a sprayed chemical, a deposited substance such as fertilizer, or a laser beam.

The link 11400 from the target area 12000 to the onsite platform 10400 represents images or other sensor readings captured by the onsite platform 10400 both prior to the treatment operation and during and after the treatment operation.

3. Onsite System Embodiments

FIG. 32A shows one example implementation of the onsite platform 10400. The onsite platform 10400 may include a real-time processing engine 20000. The real-time processing engine 20000 may analyze acquired images, as is further described in the present document. In some embodiments, the onsite platform 10400 may be configured to implement ML support functionalities 20200, as is further described throughout the present document. In some embodiments, the onsite platform 10400 may include one or more sensors 20400 to acquire various information about the environment in which the onsite platform 10400 operates. Some examples of the sensors are depicted in FIG. 32D and include pose sensors 24000, light sensors 24200, and other sensors 24400 such as humidity sensors. In some embodiments, additional input/output modules 20600 may be included. Examples of such modules (see FIG. 32E) include ejectors 25000, which may be laser emitters or liquid spraying turrets and are further described in the present document, cameras 25200, and other data communication interfaces 25400 including wireless or wired data communication interfaces. Using the sensors 20400 and/or the I/O 20600, the onsite platform 10400 may acquire various inputs 11600, as further described in the present document.

The onsite platform 10400 may include a number of treatment containers and treatment nozzles or a number of laser sources. A typical number may be between 2 and 6 spraying or laser mechanisms, each with one to six individual treatment units that can be configured to target, aim, track, and emit a treatment onto a specific small stationary target in the geographic boundary, such as an agricultural environment, while on a moving vehicle. The onsite platform 10400 may also continuously track the pose of each of the treatment mechanisms, such as the spraying/laser mechanisms, in terms of their availability and the direction to which these mechanisms are pointing and will eject.

FIG. 32B shows a workflow that may be implemented by a system such as the onsite platform 10400 on an agricultural vehicle. At 21200, the system may turn on and go through a self-calibration to ensure that the system achieves normal, or expected, operational conditions. In some embodiments, the calibration operation may be used to ensure that each position, orientation, and what each sensor is seeing are in agreement. This includes sensor calibration, fusion calibration, clock calibration, accounting for latency, and so on. One such technique is called projection and reprojection of stereo cameras. For example, as illustrated in the flowchart of FIG. 43, a method 130000 for calibration may be as follows. At 130200, two cameras may be used to obtain a stereo image. Based on the camera separation and an estimated nominal distance at which an object X in the image is located, a first determination (130400) may be performed using a frame captured from a first camera, wherein the first determination estimates where in a second image captured from the second camera the object should appear. Similarly, a second determination (130600) may be made, based on the second image captured from the second camera, about where in the first image captured by the first camera the object X should occur. Accordingly, one implementation of calibration may include a detecting operation in which the object X is detected, followed by two estimated position determination operations, one on each image of a stereo image pair, followed by an adjustment to the system (e.g., 22500 in FIG. 32B) based on any mismatch found in the determination operations. The adjustment may be made in the analog domain (e.g., by moving camera positions for correct alignment) or in the digital domain (e.g., by correcting fractional pixel offsets between left and right images in the stereo pair).
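The projection/reprojection check can be expressed with the standard rectified-stereo disparity relation. The Python sketch below is a minimal illustration, assuming a rectified stereo pair and a pinhole camera model; the focal length, baseline, distance, and pixel values are made up for the example and are not parameters of this disclosure.

def expected_column_in_right(u_left: float, focal_px: float,
                             baseline_m: float, depth_m: float) -> float:
    """Predict where an object seen at column u_left in the left image
    should appear in the right image of a rectified stereo pair.

    Uses the standard disparity relation d = f * B / Z.
    """
    disparity = focal_px * baseline_m / depth_m
    return u_left - disparity

def calibration_mismatch(u_left, u_right_observed, focal_px, baseline_m, depth_m):
    """Signed pixel error between predicted and observed object positions.

    A persistent non-zero mismatch suggests the rig needs an analog
    adjustment (re-aiming a camera) or a digital one (a sub-pixel offset
    correction between the left and right images).
    """
    return u_right_observed - expected_column_in_right(
        u_left, focal_px, baseline_m, depth_m)

# Example: focal length 1400 px, 10 cm baseline, object at roughly 1.2 m.
err = calibration_mismatch(u_left=820.0, u_right_observed=705.5,
                           focal_px=1400.0, baseline_m=0.10, depth_m=1.2)
print(err)  # about 2.2 px of mismatch in this example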

Alternatively, or additionally, a stereo matching technique may be used to combine images obtained from the cameras to generate ingest images. It will be appreciated that such a calibration operation is able to verify that a detected object is actually present in the physical world and can thus be submitted (130800) for subsequent ML-based detection and treatment, as described in the present document.

In some embodiments, calibration can be done by scanning a known object of known size and at a known distance, like that of a barcode or another known pattern.

FIG. 39 shows an example of a starting configuration of the onsite platform 10400, showing a number of ML algorithms (from 1 to N) loaded on the onsite platform 10400, zero or more computer vision (CV) algorithms loaded on the onsite platform 10400, and a map of the area in which the agricultural vehicle is operating. Upon startup, the system may perform an optional calibration process to calibrate certain operational parameters such as the accuracy of pose sensing or image capture equipment. One of the tasks performed at 21200 may be determination of a pose of the system. The pose may refer to the physical location and/or bearing of the vehicle.

At 21400, the vehicle may ingest images. Various embodiments for ingesting one or more images are described in the present document. Ingested images may be of different types. One type of ingested image may be used for operational reasons such as pose estimation, map building, etc. Another type of image may be ingested for use in the determination of treatment targets by searching for objects and their positions in the ingested images.

In some embodiments, the vehicle may continually monitor its pose throughout the described workflow. In some embodiments, the agricultural vehicle may carry two groups of sensors. One group of sensors may be used for global registry and the other group of sensors may be used for local registry. Furthermore, each group can also serve two purposes. Accordingly, there may be four different outcomes or benefits due to such an arrangement. In some embodiments, the first group of sensors may be fitted on the agricultural vehicle (for global positioning estimation) and a second group of sensors may be positioned on individual modules of the onsite platform 10400 (for local pose estimation), as further described below.

In some embodiments, the Group 1 sensors (24100) gather sensor readings in real time for the vehicle pose. Further, the Group 1 sensors may gather or use the sensor readings to build a global map in real time and/or offline.

In some embodiments, Group 2 sensors (24300) gather sensor readings in real time for object detection (e.g., as discussed with respect to ingest images 21400) for real-time treatment. Furthermore, Group 2 sensors may gather sensor readings and track objects (for example, without even knowing what they are, just that they are definitely objects) to get the pose of the local sensors.

One advantage of this arrangement is that the onsite platform 10400 can map specific plant objects (because of the ingested images) and integrate that into a global map (the robot knows where each ingested image is in the greater scheme from the global map). In some embodiments, a comprehensive high definition (HD) map may be generated. This HD map may be zoomable and may have specific information associated with specific portions of the map indicating a location, a treatment history, what the object looks like, and so on.

Another advantage of this arrangement is that, in real time, from the treatment turret's perspective, an accurate reading may be obtained because the sprayer now knows its local pose from the sensors physically close to it, as well as the vehicle pose because of the “macro sensors”: visual simultaneous localization and mapping (VSLAM) cameras, global positioning system (GPS), inertial measurement unit (IMU), etc. (the sprayer has encoders to further bolster the accuracy of where it is). The superposition of all of these real-time readings to obtain pose makes the targeting and tracking more accurate. At 21600, the ingested image is analyzed to discern one or more targets or objects in the image.

At 21800, a confirmation is made regarding whether or not the objects detected in the previous step should indeed be considered as targets for treatment. The confirmation may be made based on what type of object was detected and whether the detection was robust.

Using the confirmation, a decision may be made regarding whether an ejector or a spray is to be activated for treatment. As further described in the present document, the decision may be made by analyzing the images using an ML algorithm that is trained to identify a particular crop or a particular weed. Some of the tasks performed for target area processing are depicted in FIG. 34 and include object detection 40200 (e.g., the outcome of 21600), object verification 40400, object tracking 40600, occlusion detection 40800, and ejector control 41000, all of which are further described below.

Object Verification 40400

Based on the identification, either a rule of exclusion or a rule of inclusion may be used to identify the targets for a next action. The rule of exclusion may, for example, label certain identified objects to be excluded from the next step of operation. These objects may be, for example, a crop or a fruit or vegetation that is intended to be grown in a field. The rule of inclusion may, for example, label certain identified objects to be included in the next step of operation. Examples of such objects may include weeds, grass, or other undesirable growth identified in the image. In some embodiments, object verification 40400 may be used to control the treatment application mechanism such that if a target object is within close proximity (e.g., within a threshold distance) of another object that is deemed to be a high value object, then the treatment application mechanism may be controlled to mitigate any adverse impact to the high value object. The controlling may be done by reducing an amount of ejection or a duration of ejection, or by pointing the treatment tip to a location within the target object that is the farthest possible position from the high value object that is not to be disturbed by application of treatment.

In various embodiments, the images ingested at 21400 may comprise point cloud data, fused data points such as point cloud data in sync with image data, sonar, radar, etc. As further described in the present document, these images may be analyzed to identify objects such as flowers, weeds, fruits, objects, etc. As previously described, in some embodiments, a type of ingested image may be used to estimate pose and visual inertial state estimation (VIO) or for simultaneous localization and mapping (SLAM).

The ingested and processed images may also include any salient points to track. For example, the points can just be interesting groups of pixels that likely map to a real-world object or pattern. They need not have a direct correspondence to a real object with real meaning. For example, a cluster of “corners” or “lines” can be detected (with or without ML) and tracked so the sensor sensing it can determine where it is relative to the stationary “corner”. Particularly for agriculture, this can include real-world defined objects like tree trunks, beds, troughs, rocks, ditches, gravel with specific shapes/patterns, poles, and irrigation systems. Such objects may be called landmarks within the collected visual imagery.

Object Tracking 40600

Based on the identified targets, the real-time processing engine 20000 may then proceed to perform, at 22000, object tracking. The operation may include, e.g., preparing the onsite platform 10400 for treatment of one or more of the identified targets. The real-time processing engine may need to take several operational factors into account in issuing a command to adjust the treatment mechanism, such as the spray turret or the laser source, to eject an appropriate amount or bolus of liquid or light for an appropriate duration in an appropriate direction with an appropriate force.

For example, the real-time processing engine 20000 may determine a time interval between a first time instance at which an image that contains a particular treatment candidate object was captured and a time at which the real-time processing engine 20000 will deliver the liquid to hit the object. This time interval will depend on several factors that include: (1) image capture delay, (2) image pre-processing delay, (3) ML algorithm execution delay to identify the object, (4) computational delay from confirming the object (step 21800), (5) computational time to make a decision to treat and issue a command to the ejector mechanism, and (6) inertial delay or physical reaction time to prepare the ejector mechanism to shoot in an appropriate orientation pointing to the future position where the target object will be when the sprayed liquid or treatment substance reaches the object. To be able to take into account delays such as the inertial delay, the real-time processing engine 20000 may constantly be aware of its pose and the orientation of the ejector mechanism. The real-time processing engine 20000 will also estimate a speed of relative movement between the agricultural vehicle and the onsite platform 10400 and the target object to predict a position of the object when the sprayed liquid reaches the target object.

For predicting the future position of a target object, various image processing or computer vision algorithms may be used. For example, in some embodiments, an optical flow of the target object may be determined. The optical flow may be determined using a number of “control points” that define the object (typically 4 to 6 control points) and the relative movement of the control points between successive frames. Based on the assumption that rigid objects move smoothly, the optical flow may be used to predict the position of the object at some number of frame times in the future. In some embodiments, a Lucas-Kanade tracking method may be used to track small objects that move incrementally relative to a moving vehicle. Performing ML detection on every frame may at times be resource expensive, so tracking a cluster of points from an object that is identified in frame 1 into frames 2 and 3 is much easier and also likely accurate. Tracking objects also makes it simpler to send instructions to a treatment sprayer to emit a projectile at the object detected and localized in frame 1 while still moving. For example, the ML detector only needs to detect on frame 1, and the tracker algorithm, which is computationally less expensive than the ML detection, can track the object into frame X and cause an instruction to be sent to treat the object. FIG. 44 shows an example workflow 140000 for such a procedure, which includes running an ML detection algorithm on a first frame (140200) that results in identification of an object in the first frame; turning off the ML algorithm for the N next frames after the first frame while turning on a tracking algorithm for the N frames to track the object (140400); and directing a treatment ejector to treat the object at a target frame after the N frames based on a targeted position estimate of the object in the target frame (140600). The number N may be 3 to 4 frames, and may depend on extrinsic factors such as the speed of the agricultural vehicle and environmental factors, and intrinsic factors such as a confidence level of the ML detection and the tracking algorithm.
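The detect-once-then-track pattern of workflow 140000 can be sketched with OpenCV's pyramidal Lucas-Kanade tracker. The code below is illustrative only; the run_ml_detector callable and the frame source are placeholders, and N is fixed at 3 purely as an example.

import cv2

def detect_then_track(frames, run_ml_detector, n_tracked_frames=3):
    """Run the (expensive) ML detector on frame 0, then track the detected
    control points with Lucas-Kanade optical flow for the next N frames.

    `frames` is an iterable of grayscale images; `run_ml_detector` is a
    placeholder returning an (M, 1, 2) float32 array of control points.
    """
    frames = iter(frames)
    prev = next(frames)
    points = run_ml_detector(prev)  # ML detection only on the first frame

    lk_params = dict(winSize=(21, 21), maxLevel=3,
                     criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

    for _ in range(n_tracked_frames):
        curr = next(frames)
        new_points, status, _err = cv2.calcOpticalFlowPyrLK(
            prev, curr, points, None, **lk_params)
        # Keep only the points that were successfully tracked into this frame.
        points = new_points[status.flatten() == 1].reshape(-1, 1, 2)
        prev = curr

    # The surviving points give the target's estimated position in the final
    # frame; a treatment ejector could be aimed at their centroid.
    return points.reshape(-1, 2).mean(axis=0) if len(points) else None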

Occlusion Detection 40800

In some embodiments, the objects in the ingested images may be tracked to check for occlusions. For example, if a tracked object A (for discussion related to occlusion detection, tracking an object may refer to detecting objects using an ML or CV detector in multiple consecutive frames such that the ML or CV detector is analyzing each frame for the same object) is seen in frames 1 and 2, but is not seen in frame 3, a determination may be made regarding whether this has occurred due to the object A being falsely identified in the first two frames, or because the object is occluded behind another object B in frame 3. Occlusion detection logic may check for whether the object A re-appeared in subsequent frames 4 or later when the camera angle relative to the object A changes. Alternatively, or in addition, the occlusion detection logic may track optical trajectories of various objects with respect to their relative positions from the camera to resolve a situation of whether object A has disappeared from an image due to a possible occlusion or due to another error. With respect to a decision regarding whether or not an object that was occluded is to be treated, various strategies may be used. In some implementations, a determination may be made regarding whether an object that is a target for treatment (e.g., a weed) may become occluded at the future time instant at which the bolus of herbicide is expected to hit the object. If the answer is yes, then the object may not be treated (e.g., sprayed upon). In some implementations, objects that “disappear” from certain intervening images may still be treated in the case that it is determined that the object disappeared from one or more images due to an expected occlusion. In one example, tracking objects across frames can be performed by applying one or more warping functions to each subsequent frame such that comparing a first frame to subsequent frames of captured images can be performed more accurately, particularly on a moving vehicle, such that deviations such as a change in field of view from frame to frame can be accounted for.

FIG. 41 is an example of a method 110000 for performing occlusion detection of M objects, where M is an integer greater than or equal to 1. The method 110000 may be implemented while performing the occlusion detection 40800. At 110200, M objects may be tracked in the ingested image sequence. In some embodiments, the tracking may be performed after the ML algorithm has detected the objects in the ingested images using techniques described in the present document. Here, the number M may be a positive integer, with typical values of between 2 and 20 objects, though up to hundreds of objects may be detected per frame. Some examples are disclosed with respect to operation 40600 herein. At 110400, it is determined that an object that was present in a previous image is not present in a current image. Upon determining that the object is not visible in the current frame, a determination is made regarding the occlusion status of the object. In some embodiments, the occlusion status of a single object from the M objects may be determined by comparing a projected trajectory of the object with the projected trajectories of the M-1 remaining objects to check for intersection along an axis between a location of the camera and a predicted location of the object in the current frame. In case the occlusion status is determined to be "yes", the current frame is marked as an occlusion frame for the object and processing continues with the next frame. In case the occlusion status is determined to be "no", it is determined that the object has moved out of the active area of the current image capture and it is marked accordingly. In case the object has been determined as having moved out of the active area of imaging, the object may be removed from the list of target objects for treatment.
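The occlusion test in method 110000 can be pictured as a simple geometric check along the camera-to-object ray. The sketch below is a simplified 2D illustration; the predicted-position inputs, the effective object radius, and the function names are assumptions for the example and are not the disclosed implementation.

import numpy as np

def occlusion_status(camera_pos, missing_obj_pred, other_preds, radius=0.05):
    """Return True if the missing object's predicted position is plausibly
    hidden behind another tracked object, False if it likely left the view.

    All positions are 2D ground-plane coordinates in meters; `radius` is an
    assumed effective object radius used for the intersection test.
    """
    camera_pos = np.asarray(camera_pos, dtype=float)
    target = np.asarray(missing_obj_pred, dtype=float)
    ray = target - camera_pos
    ray_len = np.linalg.norm(ray)
    ray_dir = ray / ray_len

    for other in other_preds:
        other = np.asarray(other, dtype=float)
        t = np.dot(other - camera_pos, ray_dir)      # projection onto the ray
        if 0.0 < t < ray_len:                        # occluder must sit in front of the target
            perp = np.linalg.norm((camera_pos + t * ray_dir) - other)
            if perp < radius:                        # close enough to block the line of sight
                return True
    return False

# If the result is False, the object is treated as having left the active
# imaging area and removed from the treatment list, as described in method 110000.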

Ejector Control 41000

In some embodiments, the real-time processing engine 20000 may determine an appropriate amount of liquid to be ejected towards the target object(s). In some embodiments, depending on the type of weeds that will be encountered in a field, a particular type of herbicide might be loaded into the spray turrets. Furthermore, a nominal value of the herbicide bolus may be pre-specified for the type of weeds. During operation, the real-time processing engine 20000 may determine a delta to be added to this nominal value or a delta to be subtracted from the nominal value when shooting at the target objects. One factor that may add a delta amount to the nominal value is that the weed may have been expected to be eliminated based on previous runs and previous maps, but has not been effectively killed. Another factor to add a delta amount may be that the detected object (e.g., the borders or the bounding box of the object) may be larger than a typical value for the weed (e.g., 20% greater area than the typical value). Another factor to add a delta amount may be related to environmental factors such as rain or wind. For example, if the ambient situation is rainy or windy, a larger bolus may be delivered to ensure that an effective amount of liquid lands on the target object. Conversely, factors that may be used to reduce the nominal amount by a delta could be that the detected object is smaller in size than the typical size, or that the detected object is within a physical distance of a desirable object. Another factor may be that there are numerous target objects close to each other and therefore a smaller dose for each target object is acceptable.
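A minimal sketch of how such factors could combine into a dose adjustment is shown below, assuming a pre-specified nominal bolus per weed type. The individual delta weights, thresholds, and function name are invented for illustration and are not values from this disclosure.

def adjust_bolus(nominal_ml, detected_area, typical_area,
                 previously_treated=False, rainy_or_windy=False,
                 near_desirable=False, crowded=False):
    """Add or subtract illustrative deltas from a nominal herbicide bolus (mL)."""
    bolus = nominal_ml
    if previously_treated:                    # survived an earlier run: increase dose
        bolus += 0.3 * nominal_ml
    if detected_area > 1.2 * typical_area:    # e.g., 20% larger bounding box than typical
        bolus += 0.2 * nominal_ml
    if rainy_or_windy:                        # compensate for drift and wash-off
        bolus += 0.25 * nominal_ml
    if detected_area < 0.8 * typical_area:    # smaller than the typical weed
        bolus -= 0.2 * nominal_ml
    if near_desirable:                        # reduce risk to an adjacent crop
        bolus -= 0.3 * nominal_ml
    if crowded:                               # many nearby targets share the treatment
        bolus -= 0.2 * nominal_ml
    return max(bolus, 0.1 * nominal_ml)       # never drop below a small floor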

As disclosed in Section 10, in some cases the ejector may be controlled to aim at a specific critical point of the target object (e.g., root, stem, or leaves). Alternatively, in some cases, a volumetric approach in which the entire target object is treated may be used. This determination may be made using pre-determined rules that depend on, for example, the type of weed, the chemical formula of the herbicide being used, the level of weed control that is desired during the run, and so on.

At 22200, the real-time processing engine 20000 may activate a mechanism to eject a chemical through one or more nozzles by pointing the one or more nozzles in the directions of the identified targets. The real-time processing engine 20000 may activate a mechanism to emit a light pulse through one or more emitter head units by configuring the emitter head of an emitter unit to point to a target object in real time on a physically moving platform, the light pulse configured to burn a portion of a plant.

During and after the treatment operation 22200, the real-time processing engine 20000 may gather feedback, at 22400, from the target area. The feedback may be used, for example, to verify whether the treatment occurred, an amount of treatment, and a post-treatment image of the one or more identified targets. In some embodiments, the feedback may be live or in real time. This may allow the system to determine if there was a miss in the treatment application. The system may then attempt to figure out whether the mark was missed due to a calculation error, a physical spray nozzle problem, wind, etc. For example, if salt or residue builds up in the sprayer, even though the sprayer has the line of sight correct, the residual build-up could perturb the trajectory of the desired spray, in which case the system may need to run a cleaning routine or adjust the pressure of ejection during treatment, and, in real time or on a later run, remember to adjust and re-treat the plant. In some embodiments, after a run is complete, the data is uploaded to the cloud to remap the entire geographic boundary (a farm) with updated images/views/2D/3D models of every plant and its location, with its treatment history attached and its phenology logged (if it is any different this time), including, for example, stage of growth, etc.

Alternatively, in some embodiments, another weed-removing mechanism may be activated. For example, in some embodiments, a knife or scissors may be controlled to cut into the area of interest. In some embodiments, scissors may be used to cut a growth in an area of interest.

Additional examples of this operation are disclosed throughout the present document, including Sections 6 and 9 of the present document.

4. Examples of Agricultural Vehicle

An agricultural vehicle may be a vehicle that is deployable on unpaved farm ground. The vehicle itself can be navigated autonomously, as a self-driving vehicle. For autonomous navigation, the vehicle may use the input 11600 and the pose estimation functionality 22800 described in the present document. Sensory inputs may be processed to obtain a pose of the vehicle. The pose may be used to determine a next location or a next pose to which the vehicle proceeds. The decision may be used to control actuators connected to motion control.

In some embodiments, the agricultural vehicle may be a tractor with an attachment that stations the onsite platform 10400. As disclosed in Section 2, in some embodiments, the onsite platform 10400 may include one or more modules equipped with cameras and the treatment mechanism, such as a spray nozzle pointing towards the ground or to a side, and a number of additional sensors to be able to figure out the global and local poses of the onsite platform 10400.

5. Examples of Pose Estimation

In some embodiments, the method 60000 (refer to FIG. 36) includes determining a pose of the agricultural vehicle. In some embodiments, the pose is continually determined. In some embodiments, the pose is determined using sensory inputs including one or more of a global positioning system input, an inertial measurement unit, a visual sensor, a radar sensor, sonar, LiDAR, an RGB-D camera, infrared, multispectral, or optoelectrical sensors, encoders, and so on.

In some embodiments, the pose is determined using multiple sensed and/or calculated points. For example, a GPS measurement may be combined together with a pre-loaded map of an area to determine the pose. In one beneficial aspect, the combination of multiple sensory readings allows the system to compensate for errors in each individual sensor measurement.

In some embodiments, visual simultaneous localization and mapping (vSLAM) or SLAM may be used. Additionally, or alternatively, visual odometry (VO) or visual inertial odometry (VIO) may be used. In various embodiments, vSLAM may be performed with local sensors and/or with global sensors. In some embodiments, vSLAM may be done with keypoint detection, key cluster detection, and so on. Additional features of pose estimation include one or more of: frame-to-frame detection, local bundle adjustment, object-to-object detection, known-object-to-known-object matching, using an ML-detected object just for tracking, etc. In some embodiments, pose detection may be achieved by matching frames in time, matching frames in stereo, or both. In some embodiments, pose estimation may use line detection, corner detection, blob detection, etc. to build HD maps online and offline, and to get pose in real time.

In some embodiments, the pose of the onsite platform 10400 may be represented using a relative term relating to the location and orientation (6 total degrees of freedom) from some base location.

Pose may refer to a location and orientation of an object relative to a frame of reference (x, y, z, phi, theta, psi, for example). For example, a “Global Frame of reference” may be defined as a corner of a farm (or a barn, or a 5G/WiFi/GPS tower on the farm) as (0, 0, 0, 0, 0, 0). The pose of the vehicle would be (x1, y1, z1, phi1, theta1, psi1), which may get checked frequently (e.g., 200-5000 times a second). Then the sprayer can also have a pose relative to the vehicle. Then the sprayer head (because it itself is a gimbal) can also have a pose relative to the sprayer or to the vehicle, or directly to the “global frame”. Pose estimation may involve figuring out all of these poses to get a final reading, meaning a final (x, y, z, phi, theta, psi) of the treatment head assembly at t=1 can be relative to the body of the sprayer, relative to the vehicle, relative to some (0, 0, 0, 0, 0, 0) location of the farm. Alternatively, or in addition, the pose can be relative to the body, the body then relative to the vehicle, the vehicle then relative to the farm.
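The chained frames described here (treatment head relative to the sprayer body, body relative to the vehicle, vehicle relative to the farm's global frame) compose like ordinary rigid transforms. The 2D sketch below (x, y, heading only) is a simplified illustration of that composition under assumed example values; a real system would compose full 6-degree-of-freedom transforms.

import math
from dataclasses import dataclass

@dataclass
class Pose2D:
    x: float      # meters, expressed in the parent frame
    y: float      # meters, expressed in the parent frame
    theta: float  # heading in radians, expressed in the parent frame

def compose(parent: Pose2D, child: Pose2D) -> Pose2D:
    """Express `child` (given in `parent`'s frame) in the frame that `parent` is given in."""
    c, s = math.cos(parent.theta), math.sin(parent.theta)
    return Pose2D(
        x=parent.x + c * child.x - s * child.y,
        y=parent.y + s * child.x + c * child.y,
        theta=parent.theta + child.theta,
    )

# farm -> vehicle -> spraybox -> spray head, each pose relative to its parent frame.
vehicle_in_farm = Pose2D(120.0, 45.0, math.radians(90))
spraybox_in_vehicle = Pose2D(0.8, -0.4, 0.0)
head_in_spraybox = Pose2D(0.1, 0.0, math.radians(-15))

head_in_farm = compose(compose(vehicle_in_farm, spraybox_in_vehicle), head_in_spraybox)
print(head_in_farm)  # global pose of the treatment head at this instant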

Therefore, a “pose” of each object (e.g., a fruit, a flower, a weed, etc.) can be the location of it relative to the camera, or to the corner of the farm. Therefore, the pose of an object may be considered to be a localization of the object in some geographic area/boundary (global map/submap).

In some embodiments, pose for the vehicle is caught by fusing two or more of: 1. cameras (not the cameras looking for flowers) looking at the global world (the farm), 2. GPS readings, 3. input from IMUs, 4. wheel encoders, 5. steering wheel encoders, 6. LiDAR readings, 7. radar readings, 8. sonar, and 9. as many other sensors as needed to get the position of the vehicle. As mentioned before, these can be flashing at 24-5000 Hz, for example, and feeding that information to each treatment mechanism controlled by the onsite platform 10400 at any speed.

2. In some embodiments, the pose for the treatment mechanism may be assumed to be the same as that of the agricultural vehicle because it may be fixed on the vehicle. In some embodiments, the turret of the treatment mechanism may be fixed to a platform that holds a controller that controls the treatment mechanism, called a spraybox computer, which may also be fixed onto the vehicle. Therefore, a change in pose of the sprayer is just the same as the change in pose of the spraybox. Furthermore, the pose of both of these may change with changes to the pose of the vehicle when they are fixed to the vehicle. Each spraybox's computer and sprayer's microcontroller will work to keep track of pose. The spraybox computer will constantly ask the vehicle for what the pose is so it will know its own pose, and then let the microcontroller of the sprayer know. At this point, a particular spraybox may only care about its own pose, and not the pose of other sprayboxes. The other sprayboxes have their own computers to determine their own poses.

3. Pose for the treatment mechanism and its discharge nozzle is caught by the spraybox computer by asking the vehicle for its pose, but also asking the microcontroller of the sprayer itself what the encoders are saying the motors are doing, and what the motors did last. The spraybox computer may combine the two together to determine the discharge nozzle's pose in the world at the current time, and adjust for treatment.

The use of global pose and local pose may be highlighted using the following use cases. In some embodiments, the onsite platform 10400 may simply be looking for weeds to kill and crops to not kill while it is killing weeds. There may be no tracking of where the weed is in the world. If the same weed is seen during another run (e.g., a month later), that dead weed may be the same weed that was identified a month ago and indexed. However, the onsite platform 10400 may not care (due to the lack of global pose information) and may just shoot at weeds based on local pose information.

In some embodiments, the onsite platform 10400 may be configured for finding a weed to kill. It may look for a specific object, e.g., weed #1284 on farm column #32 on farm X, that was previously detected, match it with the previous detection, and treat it in real time.

The first situation does not require any global registry of pose. As long as the spraybox identifies a weed and knows where the weed is relative to the spraybox, it will just treat the object. It does not matter if the spraybox knows where the spraybox is in the world, or where the weed is. The onsite platform 10400 may just use local SLAM to track the weed, or other pose estimation techniques, to find and track plants relative to the spraybox. The second situation requires mapping the world as well as finding the weed and indexing the state/history of each weed.

6. Treatment Verification

In some embodiments, the method 60000 includes capturing, using an auxiliary camera system 24200 on the onsite platform 10400, one or more images of the operation of the ejector ejecting towards the region of interest. In some embodiments, the auxiliary camera system is configured to operate at a second frame capture rate that is greater than the first frame capture rate.

In some embodiments, the method 60000 includes uploading the one or more images of the target area to a server.

In some embodiments, the cameras may be time-synchronized with each other. In some embodiments, the cameras may also coordinate with other equipment on the agricultural vehicle such as light sensors and other sensors. For example, based on ambient light, the camera exposure or frame capture rate may be adjusted to obtain a high-quality image.

7. Embodiments of Image Analysis

In some embodiments, the analyzing includes using the one or more images as an input to a machine learning implementation to generate the target. The target may be identified, for example, using a bounding box around a region of interest in the one or more images. In some embodiments, the image analysis may use semantic segmentation, including pixel segmentation, and labeling various objects in the one or more images as fruit, flower, random noise, background, landmark, etc.

In some embodiments, the ML implementation includes use of a neural network.

In some embodiments, an object of interest may comprise a subset of all pixels forming the target. For example, a target may correspond to a bounding box of an undesirable vegetation (e.g., a weed), while the object of interest may correspond to contours of the weed. In some embodiments, an ML implementation may be configured to draw rectangular boxes around a number of pixels. This may be in the form of an actual box being drawn in the image, or just a storage of notes of the 4 corners of a box within the image, and may include the label name/object type. In some embodiments, the box may be drawn such that the entire box is an object of interest, for example, a weed, a crop leaf, a rock, a bed, a trough, a fruit, a bud, a flower, etc. For example, objects within the box may not be distinguished as foreground objects against a background within the box. In some implementations, an ML implementation may be configured to perform pixel segmentation of an entire image (e.g., 4K×2 from stereo) where every pixel is either an object of interest that the ML model knows or is considered to be background.

In some embodiments, the ML implementation can draw boxes around objects of interest, followed by another ML model or the same ML model performing pixel segmentation on each individual box. In some embodiments, a computer vision technique may be used to ease the load. For example, a workflow 150000 depicted in FIG. 45A may be used. As shown, at 150200, CV draws boxes around interesting clusters of pixels. For example, referring to FIG. 35A, in some embodiments, CV may be used to identify rows or stripes of the ingested images 50000 in which the rows of vegetation belong (e.g., two rows are depicted in FIG. 35A). In some embodiments, alternatively or additionally, ML may also be used to perform this coarse identification to identify portions of images where future detection should work. This way, the number of pixels to be handled by the next computing operation could be reduced. Next, ML may be used to perform ML detection (150400) on those clusters of pixels. For example, referring again to FIG. 35A, ML may be used to put the dashed bounding boxes or borders around pixel clusters that seem to include objects of interest (e.g., treatment targets). Next, at 150600, further refinement of the detection may be performed. For example, CV or ML may be used inside each of the identified areas to further identify foreground and background within each of the boxes (e.g., as depicted in FIG. 35A). For example, the CV or ML algorithm may separate out brown background (ground) from green objects (vegetation) or objects with other non-brown colors (e.g., a flower or a fruit). For example, in some implementations, as shown in an optional operation 150800, ML may draw boxes and CV (or another ML model) may partition background from foreground. In some implementations, instead, or in addition, the ML or CV algorithm may generate superpixels in the boxes. The superpixels may represent a group of pixels that is determined to have a similar visual characteristic. For example, the underlying image portion may be a rigid object whose pixels will exhibit similar behavior of motion, rotation, or coloring. Superpixel refinement may offer a reduced computational load during subsequent processing of the images.
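A rough Python sketch of this kind of coarse-to-fine flow (CV proposals, ML detection on the proposals, then foreground/background refinement) is shown below. It uses a simple HSV green mask as the "CV" stage; the threshold values, the ml_detector callable, and the function names are placeholders for illustration and are not part of this disclosure.

import cv2

def green_cluster_boxes(image_bgr, min_area=400):
    """Stage 1 (CV): propose boxes around green pixel clusters."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (35, 60, 40), (85, 255, 255))  # rough 'green' range
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

def detect_and_refine(image_bgr, ml_detector):
    """Stage 2 (ML) on each proposed crop, then Stage 3: foreground mask.

    `ml_detector` is a placeholder callable returning a label for a crop.
    """
    results = []
    for (x, y, w, h) in green_cluster_boxes(image_bgr):
        crop = image_bgr[y:y + h, x:x + w]
        label = ml_detector(crop)                  # e.g., "weed", "crop", "background"
        hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
        foreground = cv2.inRange(hsv, (35, 60, 40), (85, 255, 255))  # green vs. soil
        results.append({"box": (x, y, w, h), "label": label, "mask": foreground})
    return results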

In some embodiments, image analysis may be performed to identify targets that include unwanted objects such as weeds. For example, an ML algorithm may be programmed to eliminate known objects from the images (e.g., carrots or another crop) and the remaining objects may be classified as being unwanted objects. Alternatively, or in addition, the ML algorithm may be trained to identify a number of weeds whose pictures have been previously used to train the ML algorithm. Accordingly, all objects that are similar to the learned images of weeds may be marked as being unwanted objects.

In some embodiments, the image analysis may use instance segmentation. Here, pixels may be separated into separate objects such as fruit 1, flower 1, flower 2, weed, dirt, noise, and so on. In some embodiments, the image analysis may use a green segmentation technique in which a carrot may be recognized separately from the surrounding green portion. Other examples of image analysis, including use of ML and/or CV techniques, are described with reference to FIG. 45A.

At the end of the image analysis, in some embodiments, the ML algorithm may produce a list of unwanted objects (e.g., weeds) in the analyzed image.

In some implementations, two consecutive frames may have an overlapping common area because the camera may have travelled less distance between two frames than the distance covered by each frame. In such cases, the image analysis may simply use the “new” portion of the next image for object detection and rely on results of previous object detection on the overlapping portion of the image, where the objects will have simply moved in a particular direction that is opposite to the direction of movement of the camera. Alternatively, in some embodiments, the ML algorithm may run on each frame separately, with the object tracker tracking validity of detected objects as described in the present document.
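
A minimal sketch of that new-region handling follows, assuming the camera travels along the image's row axis and that pose estimation supplies the distance travelled between captures and the ground resolution in meters per pixel; the function names and conventions are hypothetical.

def new_region_rows(frame_height_px, travel_m, meters_per_pixel):
    """Rows of the next frame that were not visible in the previous frame."""
    shift_px = min(int(round(travel_m / meters_per_pixel)), frame_height_px)
    return slice(0, shift_px)  # run detection only on this strip

def shift_previous_detections(boxes, travel_m, meters_per_pixel):
    """Re-use prior detections by sliding them opposite to the camera motion."""
    shift_px = int(round(travel_m / meters_per_pixel))
    return [(x, y + shift_px, w, h) for (x, y, w, h) in boxes]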

In some embodiments, ingested images may be processed through pre-processing to make them friendlier for a subsequent ML based object detection. One example of pre-processing is image rotation. Using pose information, an image may be rotated to lie along an imaginary plane that is an optimal plane for ML based object detection. For example, this plane may represent “ideal” flat ground. Another example of pre-processing is color adjustment. An ingested image may be transformed into a color space that is suitable for use by the trained ML model. Another example of pre-processing is image zooming in or zooming out, including performing warping functions and techniques to better align images from frame to frame across time from one or more sensors, or from a first sensor to a second sensor, to match, project, verify, or a combination thereof, the sensor readings captured at the same time. This pre-processing may be used in case the pose estimation indicates that the relative height of the camera from the ground has deviated above a threshold. Another example of pre-processing is image rotation that may be used to adjust a misalignment between the pose of the agricultural vehicle and the camera direction. In the above examples, the pre-processing of images can be performed in real-time on a moving platform just before the step of processing the processed images or other sensor readings for object or keypoint identification.
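
The snippet below is an illustrative sketch of two of the pre-processing steps named above, rotation toward an assumed flat-ground plane and conversion to the color space the trained model expects; the roll-angle source and the BGR-to-RGB choice are assumptions, not prescribed by this document.

import cv2

def rotate_to_ground_plane(image, roll_deg):
    """Counter-rotate the frame by the camera roll estimated from pose."""
    h, w = image.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), -roll_deg, 1.0)
    return cv2.warpAffine(image, m, (w, h))

def to_model_color_space(image_bgr):
    """Example color adjustment: convert a BGR capture to RGB for the ML model."""
    return cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)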

8. Image Capture Embodiments

In some embodiments, the one or more images of the target area are captured using a main camera system positioned on the onsite platform 10400 on the agricultural vehicle. In some embodiments the main camera system is configured to operate at a first frame capture rate. The location of cameras typically will determine the area of farmland that is captured in a single image. For example, a camera that is located at 3 ft. height above ground will capture a smaller area of land compared to a camera that is located at 5 ft. height, comparing the two cameras with the same field of view. Consequently, an image captured from a higher elevation may show vegetation as having relatively smaller size (in terms of pixel dimensions) compared to an image taken from a lower elevation. At the same time, the image taken from the higher elevation may include a greater number of objects of interest compared to an image captured from a lower elevation than the higher elevation. Therefore, camera height may impact the performance of the ML algorithm in detecting objects and also may impact the amount of computational resources used for real-time detection of weeds and other undesirable vegetation. Accordingly, in some embodiments, ML models may be trained for use in a certain range of camera height, which may be continuously monitored during pose determination.
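
For a rough sense of this height dependence, the following back-of-the-envelope relation assumes a pinhole camera pointed straight down; the 60-degree field of view is an illustrative value, not one specified here.

import math

def ground_coverage_m(height_m, fov_deg):
    """Width of ground captured by a nadir-pointing camera with the given field of view."""
    return 2.0 * height_m * math.tan(math.radians(fov_deg) / 2.0)

# Example: a 60-degree lens covers about 1.06 m of ground at 3 ft (0.91 m)
# and about 1.76 m at 5 ft (1.52 m), so each plant spans fewer pixels at 5 ft.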

In various embodiments, image capture may be performed using cameras, RGB-D cameras, multispectral cameras in stereo or more operably linked and time synced to each other, LiDARs, radars, infrared detectors, sonar, and so on. In some cases, there may be two sets of cameras—one set for pose detection to obtain global registry of the pose of a vehicle supporting one or more spraying or plant treatment systems and subsystems and to map an environment at a global level, and one or more other sets operably connected to one or more treatment systems also supported by the vehicle for object detection, and another set for treatment verification at a local level, such that the sensors are sensing specific plant objects, e.g., as described in Section 6 of the present document.

9. Ejector Operation Embodiments

In some embodiments, the activating the ejector 24000 includes causing the ejector to eject a treatment substance, such as spraying a liquid pesticide or herbicide towards the region of interest.

In some embodiments, the activating the ejector includes causing the ejector to eject a laser beam towards the region of interest. The ejector mechanism may be pneumatically or hydraulically pressurized. A solenoid may be controlled via an electric current into an open or a closed position to release the liquid pesticide or herbicide in a desired direction. In some embodiments, the parameters that control the treatment, such as spraying or laser beaming performed by the ejector, include a pose or an orientation of the ejector, a duration for which the ejector will treat, such as eject or spray, and a pulse pattern in case the ejector is desired to perform more than one consecutive ejection. Such a mode may be used, for example, to allow the ejected liquid to be assimilated over the target in smaller quantity sprayings in rapid succession.

In some embodiments, the ejector may have a nozzle mounted on a turret and be able to move or rotate around multiple axes, such as at different angles, pitch and yaw. The turret's nozzle may be connected to a storage tank that stores the liquid to be sprayed. An example of an ejector mechanism is disclosed in the previously mentioned U.S. patent app. Ser. No. 16/724,263, entitled “TARGETING AGRICULTURAL OBJECTS TO APPLY UNITS OF TREATMENT AUTONOMOUSLY.”

10. Target Area Identification Embodiments

As described in the present document, the real-time processing engine 20000 may check ingested images for objects (see, e.g., operations at 21600). In various embodiments, the ingested one or more images may be analyzed using a variety of different techniques. These techniques may use ML or computer vision or other image processing and object identification embodiments.

In some embodiments, the analyzing the one or more images comprises comparing with a template that includes one or more images of weed plants, and wherein the region of interest corresponds to an image area that matches one or more templates. Additionally, the ingested sensor readings can be sensor readings from sensors other than cameras or specifically visible-color camera sensors, such as radar, sonar, lidar, infrared sensors, or multispectral sensors. For example, template point clouds, keypoints, or clusters can be accessed to compare with other point clouds, keypoints, or clusters of points captured in real time for object identification. In another example, a combination of sensed signals, including that of captured images, captured multispectral imagery, point clouds, or a combination thereof, can be used as templates to detect objects in real time from one or more sensors ingesting sensor readings in real time.

In some embodiments, the analyzing the one or more images comprises comparing with a template that includes one or more images of a crop, and wherein the region of interest corresponds to an image area having a mismatch with the template.

In some embodiments, a multi-image post-processing may be performed on the identified target to ensure robustness of the identification process. For example, if one image yields a target that is identified as a weed, the ML algorithm may check the next N images (N an integer) for presence or absence of the identified weed. For example, N may be equal to 3 to 7 consecutive images that are checked to see if the weed detected in the first image is also seen in the second image, the third image, and so on. For example, in some embodiments, based on the pose of the vehicle and a direction of movement of the vehicle, the ML algorithm may expect a target detected around location (x, y) in an image to be near a position (x+dx, y+dy) in a next frame. Here dx and dy represent effective movement of the target due to the movement of the vehicle (and therefore the camera) in a 2-dimensional reference frame. After affirmatively confirming that the target is being tracked with high accuracy in N consecutive frames, the system may activate the ejector mechanism as described herein. Additional techniques for target tracking are also disclosed in Section 3 and with respect to FIG. 34 .
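
A minimal sketch of such an N-frame confirmation is shown below, assuming the per-frame displacement (dx, dy) comes from the pose and direction of travel; the matching tolerance and default N are illustrative assumptions.

def confirm_target(detections_per_frame, start_xy, per_frame_shift, n=5, tol_px=20.0):
    """Return True if a target first seen at start_xy is re-found near its
    predicted position in each of n subsequent frames."""
    x, y = start_xy
    dx, dy = per_frame_shift
    for frame_dets in detections_per_frame[:n]:
        x, y = x + dx, y + dy  # expected position of the target in this frame
        if not any((px - x) ** 2 + (py - y) ** 2 <= tol_px ** 2
                   for (px, py) in frame_dets):
            return False  # lost the target; do not activate the ejector
    return True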

In some embodiments, the image analysis may provide a target area, which is a bounding box or contours of portions of an image, determined or generated by performing semantic segmentation of at least a portion of an image frame or performing superpixel segmentation of an object identified by the ML image analysis, as further described in Section 11. In addition, a further determination may be made regarding a precise location within the target area at which the ejector should be aimed. For example, depending on the type of object (e.g., a variety of weed), a pre-determined rule about which part of the weed the herbicide should be aimed at may be used. For example, for certain types of weeds, the ejector may be pointed towards roots, while for other types of weeds, the ejector may be pointed towards leaves, and so on. In some cases, if a specific critical point within the target area cannot be identified or is not required, a volumetric approach may be used to shoot the entire target area with a relatively large bolus of the herbicide.
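
One way such a per-type aiming rule could look in code is sketched below; the weed names and offsets are hypothetical placeholders, and the fallback mirrors the volumetric approach of aiming at the whole target area.

AIM_RULES = {"pigweed": "roots", "bindweed": "leaves"}  # illustrative entries only

def aim_point(weed_type, box):
    """Pick an aim point (x, y) inside a target bounding box (x, y, w, h)."""
    x, y, w, h = box
    part = AIM_RULES.get(weed_type)
    if part == "roots":
        return (x + w / 2.0, y + h)        # base of the plant
    if part == "leaves":
        return (x + w / 2.0, y + h / 3.0)  # upper foliage
    return (x + w / 2.0, y + h / 2.0)      # volumetric default: center of the box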

11. ML Embodiments

In some implementations, an ML system may be implemented on the vehicle that is configured to operate in the agricultural setting. Various aspects of the ML system are depicted in FIG. 32C, and include embodiments that provide target inclusion capability (23000), target exclusion capability (23200), various ML training methods (23400) and implementations of ML (236), as further described in the present document. The ML system may be implemented using a general-purpose computer or preferably using a specialized processor that is configured with an ML library. The ML system may be implemented at two different locations. A first portion, which may mainly perform training and building of ML models, may be implemented using the offsite computing resources 10200. A second portion, mainly used for feature extraction and object detection, or classification, or both, using a trained ML model, may be continually updated upon further training and implemented by the onsite platform 10400. At the beginning of an in-field session, the offsite computing resources 10200 may communicate the most recent ML models to the onsite platform 10400 via the link 10800. In particular, these models may be used to identify objects using one of several techniques. For example, in some embodiments, supervised learning, or unsupervised learning, may be used to train the ML model using a number of images of different objects that the onsite platform 10400 may encounter in the field. These objects include various images of a desired crop or fruit or flower that is intended to be cultivated and/or types of weeds and other undesirable vegetation that is to be eliminated from the field. The objects in these images may be labeled according to the object name. For example, previously ingested images that were correctly identified may be used for the supervised-learning-based approach. Active learning techniques may be applied to produce better candidate images for human labelling as ground truths for further training of a machine learning model. For example, a machine learning model can be configured to determine that certain image frames, or a sequence of continuous frames, ingested from a plurality of continuous frames ingested in an observation and treatment trial on a geographic boundary do not contain any objects of interest for targeting or for omitting from targeting. For example, a vehicle can pass through a patch of dirt without, in reality, any weeds, plants, or crops for a few meters. Upon uploading a continuous set of image frames to a server for analysis and for labelling, a machine learning algorithm can be applied to detect an optimal subset of frames for human labelling or quality control, including, for example, excluding the sequence of image frames capturing the few meters of dirt without any weeds, plants, or crops. Active learning can also be applied, for example, to determine common landmarks from frame to frame that are not necessarily plant objects of interest, such as target plants for treating. Once common landmarks are identified, the system via active learning can produce a subset of image frames for human labelling or quality control by removing or reducing images that have the same common landmarks as other images, to further reduce redundancy of image quality analysis.
Additionally, active learning techniques can be applied such that one or more machine learning algorithms analyze an entire set of ingested images and perform detections, classifications, labelling, pixel classification, or bounding box labelling, such that detections above a certain threshold can be used as training data and those that do not meet a threshold can be sent to a human for labelling, classifying, or performing quality control. Additionally, propagation techniques, including forward propagation and backpropagation, can be applied to further provide resources as training datasets for training the machine learning model. Additionally, unsupervised learning techniques can be applied.

In some embodiments, labeled data may be used to train the ML models. The labeled data may include exclusion targets (e.g., objects that are to be identified as being desirable or high value objects, while everything else is considered low value or undesirable), or inclusion targets (e.g., objects that are identified as targets for elimination by treatment, while everything else is to be preserved). The labeled data may provide “ground truth” images that are known, for sure, to show weeds and/or desirable vegetation. The training data may also include raw data that is not labeled, for the ML algorithm to train on. In some embodiments, ML models can be models configured to detect salient points or clusters of points of a given image or point cloud. In one example, unsupervised learning can be applied to train machine learning models to analyze unlabeled data. Additionally, an ML model can be configured to detect a certain spectrum of color, or a certain shape, that is of interest in an agricultural environment, particularly static objects such as tree trunks, row farm beds and troughs, etc.

In some embodiments, the ML model may be trained by trial and error. In these embodiments, human supervision may be provided at the offsite computing resources 10200 for improving accuracy of object detection by monitoring and correcting identification errors.

FIG. 33 shows an example of how offsite computing resources 10200 may acquire data for improving ML models during use. Data may be acquired from various sources such as raw data (30200) from field videos, or data uploaded by onsite platforms 10400 after completing their runs (e.g., via link 11000). A human operator or an artificial intelligence (AI) algorithm may be used to label (30400) some frames of raw data such that these frames may be used as labeled data for training. The raw data and the labelled or training data may be input to an ML pipeline (306) that undergoes training as described in the present document, resulting in new or revised ML models (30600) that are better trained (31000). Such better trained ML models (called ML 2.0 31200, in name only) may be downloaded to the onsite platforms 10400 via the link 10800 on a periodic basis, such as once every day or at the beginning of a run or trial. In some embodiments, transfer learning techniques can be applied to machine learning models configured to detect plant objects of a certain type located in a first type of agricultural environment and trained on datasets of plant objects or labelled plant objects of the certain type, and applied to detect plant objects of another type located in a second type of agricultural environment. In this example, an online agricultural observation and treatment system, supported by an autonomous vehicle, having one or more machine learning models downloaded and stored on a local database and accessed by a local compute unit of the treatment system, can perform agriculture-related functions on various types of farms or orchards, continuously ingest datasets of different objects, and train on all of the different types of datasets to improve the machine learning model such that the treatment system can improve performance in any agricultural environment, treating any type of plant.

In some embodiments, after an ML model is downloaded to the onsite platform 10400, the ingested images 50000 may be divided into tiles 55000, as depicted in FIG. 35B. The tiles are a smaller portion of the image 50000, typically rectangular, that may be scanned from one end of the image to the other end of the image. In some embodiments, consecutive horizontally displaced tiles may be overlapping. In some embodiments, the tiles may be non-overlapping. For example, an entire image may be processed as a number of tiles starting from left to right and progressing downwards from the top of the image to the bottom of the image. The sliding tile region may overlap in the horizontal or vertical direction. In some embodiments, non-overlapping tiles may be used to reduce the amount of computation to be performed in real-time. Alternatively, in some embodiments, overlapping tiling may be used to improve accuracy of object detection. In some embodiments, a mix of overlapping and non-overlapping tiles may be used. For example, in regions of lower illumination or lower color contrast, overlapping tiles may be used to improve accurate object detection by the ML model.
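
A sketch of such a tiling scan is below; the tile size and overlap are tunable assumptions and would in practice be chosen to exceed the expected bounding-box size of objects of interest, as discussed in the next paragraph.

def tiles(image_h, image_w, tile=512, overlap=64):
    """Yield (y0, y1, x0, x1) tile windows, scanning left to right and top to bottom."""
    step = tile - overlap if overlap else tile
    for y0 in range(0, image_h, step):
        for x0 in range(0, image_w, step):
            yield (y0, min(y0 + tile, image_h), x0, min(x0 + tile, image_w))

# Example: list(tiles(2160, 3840)) enumerates overlapping 512-pixel tiles of a 4K frame.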

The size of tiles may be chosen to be larger than an expected size of a region or a bounding box of objects expected to be found in the image. In some cases, the ML model may be adapted to varying tile sizes during offline training. In some embodiments, a pre-processing stage may be used to eliminate certain parts of an image before an ML model is applied to the image for object detection. For example, a large flat area of constant texture may be identified by pre-processing and removed from the image prior to the tiling. In some embodiments, the ML models implemented on the onsite platform 10400 may be configured to detect objects only in the new area of image that has entered the view of the camera from a previous image to a next image.

In some embodiments, different ML algorithms or models may be used simultaneously to detect objects. For example, one ML model may be used for identifying apples, another may be used for identifying pears, yet another for detecting weeds, and so on. In a typical farming situation, it may be possible to encounter 100s of different objects—including flowers, fruits, leaves, and so on. The task of detection of useful objects from among this multitude of possible objects may be simplified by selecting ML models that are specifically trained for excluding or including a certain number of these objects, as is described next. In some other embodiments, a single ML algorithm or model may be used to train on all of the varying types of objects and salient points for object or keypoint detection.

In some embodiments, a multi-level strategy may be used. For example, a first run of a first ML algorithm (or a first CV algorithm) may be used to simply mark areas of interest and eliminate areas in an image where no objects are present. This may be followed by one or more second ML algorithms (or CV algorithms) for detecting objects in the reduced image data that has been filtered through the first ML algorithm. The first ML algorithm may, for example, identify a dominant color that should be used by the second level of ML analysis. FIG. 38 shows an example of such a multi-level strategy 80000 wherein the first ML algorithm, or CV algorithm, may be applied, at 80200, on an entire image, followed by one or more second level ML (or CV) algorithms, at 80400, on reduced image data. FIG. 35A shows another example in which an ingested image includes approximately two rows of a crop and some weeds, and the image on the right identifies a number of areas for which additional ML object detection may be performed to either exclude or include objects for treating. In some other embodiments, a combination of computer vision techniques, ML techniques, or a combination thereof, can be applied in multiple levels to detect objects and their locations in the real world better, more quickly, or with fewer resources or a smaller computational load.

12. ML Target Inclusion (23000)

In some embodiments, ML may be trained for inclusion of targets. As further discussed in Section 11, inclusion objects may be weeds, insects or other objects detected in the ingested or captured images that may be targets for an ejection action for treatment, such as spraying with a chemical or laser burning. In these embodiments, a library of possible objects detected in a farm may be maintained. New objects (e.g., previously unseen weeds) may be added to this library and used for ML training. The addition of new objects may be done using offsite computing resources 10200 and may be performed between two runs of the agricultural vehicle such that newly detected objects may be used to train ML models that are used for object detection in the next run of the agricultural vehicle.

13. ML Target Exclusion (23200)

In some embodiments, ML may be trained for excluding targets. For example, a particular crop or fruit or flower may be considered for exclusion from treatment or spraying, while everything else may be okay to treat. In a farming situation, typically, a farm may be growing only one specific crop and therefore desirable objects may be limited to a small number—e.g., leaves, fruits or flowers of the crop. In comparison, the number of undesirable objects may be large, such as the number of possible weeds or other undesirable growth that may be seen on the farm. Therefore, such embodiments may reduce the amount of computational resources needed for object detection due to relatively fewer desirable objects compared to undesirable objects.

In some embodiments, a crop may be a root or a tuber that largely lies underground and therefore may not be clearly visible to a camera from above. For example, in some embodiments, an ML algorithm may be trained to identify everything that looks green and leafy. From all such objects, the objects that match the leafiness of a crop (e.g., carrot leaves) may be excluded from identification as a treatment target and all other green leafy objects may be considered treatment targets. For example, semantic segmentation models that identify objects based on color may be used. In some embodiments, ML algorithms may be trained to detect green objects using a gamut of green shades and the gamut may be successively improved based on training.
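
The exclusion idea can be sketched as follows: segment everything within an assumed green gamut, then drop the regions a crop classifier recognizes as the crop, leaving the remaining green regions as candidate targets. The HSV bounds, area threshold, and the is_crop callable are stand-ins for illustration, not values taken from this document.

import cv2

GREEN_LO, GREEN_HI = (30, 40, 40), (95, 255, 255)  # assumed HSV green gamut

def weed_candidates(image_bgr, is_crop, min_area=100):
    """Green regions that the crop classifier does not recognize as crop."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, GREEN_LO, GREEN_HI)
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    targets = []
    for i in range(1, n):  # label 0 is background
        x, y, w, h, area = stats[i]
        if area >= min_area and not is_crop(image_bgr[y:y + h, x:x + w]):
            targets.append((x, y, w, h))  # green but not crop: candidate weed
    return targets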

14. ML Training/Feature Extraction (23400)

In some embodiments, the ML algorithm may extract features of the ingested frame during the analyzing operation and during checking for objects 21600. Here, a feature may be a measurable piece of image data. The measurable piece or a combination of multiple such measurable pieces may be used to distinguish the object from other objects. For example, a shape of a line, an area and a corresponding color, or a range of colors within the object may be some examples of features of the object. In some embodiments, the object may be divided into multiple portions (e.g., leaves, stem, etc.) with each portion of the object having a feature associated with it.

In some embodiments, during ML training, both the features of the object and how to divide an object into multiple portions may be trained for the ML model. For example, one particular type of weed may not show any flowers and may be characterized by only a leaves portion and a stem portion, while another type of weed may be characterized additionally by a flower portion. In some embodiments, features may be extracted using a histogram of oriented gradients. In some embodiments, the features may be extracted using a Haar cascade implementation. In some embodiments, scale-invariant feature transforms (SIFT), FAST, SURF, ORB, or a combination thereof, may be used to detect salient features or clusters of points in a given sensor reading. For example, these computer vision techniques can be used to take into account that leaves may be in various orientations with respect to the camera that captures the images. Other techniques may be implemented to detect corners, edges, lines, blobs, keypoints, etc.
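
As an illustrative example of one of the detectors named above, the snippet below extracts ORB keypoints and descriptors with OpenCV; ORB is rotation-aware, which helps when leaves appear at arbitrary orientations relative to the camera. The feature count is an assumed parameter.

import cv2

def extract_features(image_gray, n_features=500):
    """Detect ORB keypoints and compute their binary descriptors."""
    orb = cv2.ORB_create(nfeatures=n_features)
    keypoints, descriptors = orb.detectAndCompute(image_gray, None)
    return keypoints, descriptors  # descriptors: one 32-byte vector per keypoint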

15. Deep Learning

In some embodiments, object identification may be performed using a deep learning algorithm. The deep learning algorithm may be implemented using a deep neural network. Images may be input to the deep learning algorithm, which produces a list of objects perceived in the input images. For example, a multi-layer perceptron (MLP) architecture may be used for implementing the deep learning. In some embodiments, a convolutional neural network (CNN) may be used for object identification. The CNN may use a set of hyperparameters that characterize or describe the operation of the CNN. Such parameters include the number of hidden layers, activation functions, error functions, batch size, and so on.

In some embodiments, the offsite computing resources 10200 may be used to optimize hyperparameters for the machine learning algorithm used by the onsite platform 10400. For example, at the beginning of each run made by the onsite platform 10400 in the field, the offsite computing resources 10200 may download (e.g., at 21200) a new set of models or ML behavior to the onsite platform 10400. In one advantageous aspect, such an architecture provides a flexible use of resources for farming operations where an onsite platform 10400 may learn to perform better not just based on its own previous data but also based on data collected from farm runs of other agricultural vehicles and corresponding onsite platforms that may also be controlled by the offsite computing resources.

FIG. 40 shows an example of a neural network NN 100000 used for implementing deep learning for identification of various objects as described in the present document. The NN 100000 includes multiple layers 100600 of neurons that operate upon input 100200 to produce an output 100400. The NN 100000 may be characterized by a number of hyperparameters 100800. In some embodiments, the input 100200 may include one or more ingested images. These images may be, for example, images captured by cameras on the onsite platform 10400. In some embodiments, the output 100400 may include a list of objects, a bounding box for each identified object, a confidence number for each identified object or another image parameter.

The hyperparameters 100800 for the NN 100000 may be determined based on training that may be performed using sample images and/or previous runs of the agricultural vehicle. One hyperparameter may correspond to the number of layers 100600 used for the ML implementation. This number may be determined based on previous runs as a trade-off between one or more of a computational complexity, real-timeness of the ML results, visual complexity of vegetation of a field, and so on. In some embodiments, one of the hyperparameters may comprise an input activation function used for handling input 100200. The input activation function may be, for example, a rectified linear unit (ReLU) activation function. In some embodiments, one of the hyperparameters may correspond to an output activation function. The output activation function may be, for example, a normalized exponential function. These activation functions may be specially suited for object recognition, and various parameters of these functions may be trained/optimized based on test data or in-field results. In some embodiments, one of the hyperparameters may correspond to an error function used for determining ML object detection errors. In some embodiments, mean square error may be used. In some embodiments, a human eye model mask may be used for determining the error in ML object detection. In some embodiments, one of the hyperparameters may correspond to a number of samples used for parameter updates. A tradeoff may be performed between a large number of samples between parameter updates (which requires a large amount of memory but learns faster) versus a small number of samples between parameter updates. The actual number may depend on operational parameters such as computing resources in the onsite platform 10400 of a specific agricultural vehicle. In some embodiments, one of the hyperparameters corresponds to an optimization algorithm that is used by the NN 100000 for object detection using the ML algorithm.
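
An illustrative configuration of the kind of hyperparameter set described above might look like the following; the values are placeholders and would in practice be chosen offsite from training runs and field trade-offs before being downloaded to the onsite platform.

HYPERPARAMETERS = {
    "num_hidden_layers": 12,         # depth vs. real-time budget trade-off
    "input_activation": "relu",      # rectified linear unit
    "output_activation": "softmax",  # normalized exponential over object classes
    "loss": "mean_squared_error",    # or a detection-specific error function
    "batch_size": 16,                # samples between parameter updates
    "optimizer": "adam",             # optimization algorithm used during training
}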

16. One Example Workflow

In some embodiments, an onsite platform 10400 may operate as follows. The agricultural vehicle may be configured to implement the onsite system 10400. At the beginning of a field run, an ML model (or multiple ML models) may be loaded onto the onsite system 10400 from the offsite computing resources. The agricultural vehicle may drive through a path in a field, continuously monitoring its pose via a plurality of time synced and fused sensors including GPS, IMU, rotational encoders, image sensors and point cloud sensors configured to obtain global registry of an environment, for example by performing visual SLAM and visual odometry, among other sensors configured to obtain the pose of a vehicle supporting an agricultural treatment system or subsystems. Based on the knowledge about the initial pose, the subsystems of the agricultural observation and treatment system supported by the vehicle can point additional sensors, including cameras, towards the field that includes desirable and undesirable vegetation growth. The agricultural vehicle may keep moving forward (e.g., along a row of crop plantings) and capture images of the field at a certain rate. For example, the frame capture rate may be 60 to 200 frames per second. Based on the pose and the direction of travel, the onsite platform 10400 of the agricultural vehicle may estimate the distance moved between two consecutive image captures, either as an actual distance, or a relative distance in units of pixels of the image resolution.

In parallel with the image capture operation, the onsite platform 10400 may ingest the captured images to the ML algorithm loaded on the onsite platform 10400 to perform object detection. The ML algorithm may, for example, provide a list of weeds and locations or regions in which the weeds are located within the frame. Based on the detection of one or more objects of interest in one or more images captured, the agricultural observation and treatment system can then determine the real-world location of the object of interest detected in the image.

For example, when a weed is detected in an image, the onsite platform 10400 may track the movement of the weed from one frame to the next (in general N frames, but practically between 3 to 8 frames). Upon determining with a high degree of confidence that (a) the detected object is a weed, and (b) the weed is a threshold distance away from desired vegetation such that treating or spraying the weed with herbicide will not damage the crop, the onsite platform 10400 may point a spray nozzle or a laser towards the location of the weed and spray the herbicide or the laser at the weed. The onsite platform 10400 may implement an object tracker algorithm to track the movement of various objects from one frame to the next, with the objects being detected by the ML algorithm. At any time, the object tracker algorithm may track 1 to 50 different objects in the frame. The exact number may be adaptive and may also be learned from the ML algorithm. For example, in some cases, after crop objects are identified (e.g., which may be up to 6 to 10 objects in the image), every other object may be assumed to be a weed. Alternatively, specific weeds may be tracked and treated. Thus, the ML algorithm may operate in conjunction with the object tracker algorithm to implement the ML target inclusion and the ML target exclusion strategies described in Sections 12 and 13.
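
A minimal sketch of that spray gating follows, combining the two conditions above; the frame count and clearance threshold are illustrative assumptions rather than values prescribed here.

import math

def should_treat(weed_xy, crop_positions, confirmed_frames,
                 min_frames=3, min_clearance_m=0.05):
    """Treat only if the weed is confirmed over enough frames and far enough from any crop."""
    if confirmed_frames < min_frames:
        return False  # not yet confident the detection is a weed
    clearance = min((math.dist(weed_xy, c) for c in crop_positions),
                    default=float("inf"))
    return clearance >= min_clearance_m  # avoid overspray onto the crop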

The onsite platform 10400 may capture the treatment event using a camera for quality control. Subsequently, the video may be uploaded to the offsite computing resources for verification of accuracy of the treatment either manually or by a machine.

The onsite platform 10400 may transmit the captured images to the offsite computing resources for monitoring effectiveness. For example, if the ML algorithm identifies a weed and a crop in a certain image, a bounding box of the crop may be subtracted from the image and the resulting weed region may be compared with a previous capture at the same exact location to check effectiveness of the weed-killing strategy.

17. Edge Server Embodiments

Revisiting FIG. 31 , in some embodiments an edge server may be used in the system 10000 depicted in FIG. 31 . While not shown explicitly in FIG. 31 , the edge server may be positioned between the onsite platform 10400 and the offsite computing resources 10200. For example, an edge server may be installed at a shed or a barn on a farm (e.g., within direct wireless communication range) and may be powered by electricity. The edge server may be used to achieve a trade-off between the computational power limitations on the onsite platform 10400 and the relative speed with which certain results (e.g., a decision to treat) are needed to meet a real-time operation of the farming vehicle.

In some embodiments, the onsite platform 10400 may perform preprocessing of images. As described in Section 7, the preprocessing may include color space conversion, resolution adjustment, and so on.

In some embodiments, the onsite platform 10400 may send captured images to the edge server to perform the first level of ML algorithm processing (80200, described in FIG. 38 ). This first level of ML algorithm may operate on a large number of pixels (e.g., 4K images) and may reduce the amount of image data on which ML algorithms may be run. The second-level ML algorithms may be run on the onsite platform 10400 on the reduced data, which may thus need a reduced amount of computing resources and power. The onsite platform 10400 may perform the tasks of object tracking and activation of the ejector mechanism to ensure a minimal delay to treat, e.g., spray herbicide at a weed.

Alternatively, in some embodiments, all ML algorithms may be run on the edge server (e.g., operations 21400, 21600) while the onsite platform 10400 may simply take the target object list (e.g., target area 12000) and perform operations of system start (including pose estimation) 21200, object confirmation 21800, object tracking 22000 and treatment activation 22200 and 22400.

FIG. 42 shows an example method 120000 of operating an edge server. The method 120000 includes receiving, at 120200, a configuration and one or more ML models from offsite computing resources. For example, in some embodiments, the edge server may communicate a request on behalf of an agricultural vehicle. At the time an agricultural vehicle is powered on and put into use in a field, the agricultural vehicle may register with an edge server that is operating near the field. After the initial handshake or one-way registration is complete, the edge server may transmit a message to the offsite computing resources. After the transmission of the message, the edge server may receive operational data and parameters from the offsite computing resources. The operational data may include one or more ML models that the current run of the agricultural vehicle will use. The parameters may be used for configuring other operational features such as details of image pre-processing, splitting of certain tasks between the edge server and the agricultural vehicle, and so on.

At 120400, the method 120000 includes downloading operational data and parameters to field equipment. In some embodiments, a secure communication protocol, such as one using encryption or an authentication certificate, may be used for securing the operational data and parameter downloading.

At 120600, the method 120000 includes receiving, by the edge server, run-time data from the field equipment. The run-time data generated by the agricultural vehicle and transmitted to the edge server may depend on the parameters that were configured for the operation of the agricultural vehicle. For example, in some embodiments, the run-time data may include images captured by the cameras on the agricultural vehicle. In some embodiments, the run-time data may include partially processed image data (not the entire image, but a processed version such as a downsampled version or a color reduced version, etc.). In some embodiments, the ML algorithm may be split to run partly on the agricultural vehicle and partly on the edge server. The split may be based on image panels or sub-portions or based on image groupings such as every other image. Accordingly, the run-time data may include corresponding partial image data. In some embodiments, an output of ML processing of the images may be transmitted. This output may include a listing of objects identified by the ML algorithm, including a confidence level. This information may be used by the edge server to perform object tracking.

At 120800, the method 120000 includes processing the run-time data according to the configuration. The processing may depend on the parameters that configure the resource division between the agricultural vehicle and the edge server. The resource division may depend on resource availability on the agricultural vehicle, the complexity of the operation, and so on. For example, if it is determined that a particular field operation entails tracking of a large number of objects (e.g., above a threshold such as 10 objects), the computational resources on the edge server may preferably be used. In some cases, adverse field conditions such as low illumination or rain may trigger a greater use of computational resources on the edge server, which typically may be housed in a better controlled environment. Accordingly, the processing at 120800 may include image pre-processing, object detection using the ML algorithm, object tracking, decision making regarding objects to treat, collection and compilation of ejector operation videos/images, and so on.

18. Additional Example Embodiments

FIG. 36 is a flowchart for a method 60000. The method 60000 includes obtaining (60200), by a computer system mounted on an agricultural vehicle, one or more images of a target area. The images may be obtained using the cameras and image capture embodiments described in the present document, e.g., Section 8.

The method 60000 includes analyzing (60400), by the computer system, the one or more images to determine a target. Various techniques for analyzing images are described in the present document, including, e.g., Sections 7 and 11 to 15.

The method 60000 includes activating (60600), by the computer system, an ejector onboard the agricultural vehicle to emit towards the target. Various techniques for treatment activation, such as the activation of the ejector to spray an herbicide or a laser, are described in the present document, e.g., Sections 3 and 9.

In some embodiments, two different cameras may be used to obtain depth perception to the target. Alternatively, or in addition, a single camera may be used for obtaining depth information. The single camera may be operated at two or more rates or densities to obtain the depth information.
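
For the two-camera case, the usual stereo relation gives a rough sense of how depth to the target could be recovered; the baseline and focal length below are illustrative calibration values, not ones specified in this document.

def depth_from_disparity(disparity_px, baseline_m=0.10, focal_px=1400.0):
    """Depth in meters from pixel disparity between two horizontally offset cameras."""
    if disparity_px <= 0:
        return float("inf")  # no measurable disparity: treat as very far away
    return baseline_m * focal_px / disparity_px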

In some embodiments, the method 60000 includes uploading the one or more images of the operation of the ejector to a server. For example, the server may be an edge server.

In some embodiments, prior to the obtaining the one or more images of the target area, the computer system may be initialized for use as follows. The initialization may include one or more of: (a) updating a machine learning algorithm and/or a machine learning model onboard the computer system, (b) determining a pose of the agricultural vehicle, (c) calibrating one or more sensors onboard the agricultural vehicle, (d) performing diagnostics on the ejector, or (e) time synchronizing each sensor, light, actuator/motor, and ejector.

In some embodiments, different ML models may be used for different applications of the vehicle. For example, different models may be used for weed identification, fruit identification, nut identification and so on. Alternatively, an ML model that encompasses two or more of the above may be used. Each ML model may be trained using one or more ML algorithms. For example, training may be performed via a CNN (convolutional neural network) or DNN (deep neural network) gradient descent method. Additional examples of training are disclosed in Sections 3 and 11 of the present document.

Some embodiments may preferably use the following technical solutions.

1. A method implemented by a treatment system (e.g., method 152000 depicted in FIG. 45B) having one or more processors, a storage, and a treatment mechanism, comprising: obtaining (152200), by the treatment system mountable on an agricultural vehicle and configured to implement a machine learning (ML) algorithm, one or more images of a region of an agricultural environment near the treatment system, wherein the one or more images are captured from the region of a real-world where agricultural target objects are expected to be present; determining (152400), by the treatment system, one or more parameters for use with the ML algorithm, wherein at least one of the one or more parameters is based on one or more ML models related to identification of an agricultural object; determining (152600), by the treatment system, a real-world target in the one or more images using the ML algorithm, wherein the ML algorithm is at least partly implemented using the one or more processors of the treatment system; and applying (152800) a treatment to the real-world target by selectively activating the treatment mechanism based on a result of the determining the target.

2. The method of solution 1, wherein the determining the target includes: identifying a subset of pixels of the one or more images for a next processing; detecting, using the ML algorithm, one or more areas in the subset of pixels including an area that includes the target; and performing refinement of the one or more areas.

3. The method of solution 2, wherein the identifying is performed using a first computer vision technique and the refinement is performed using a second computer vision technique.

4. The method of solution 3, wherein the refinement comprises identifying a foreground object and a background in the one or more areas.

5. The method of solution 4, wherein the identifying is performed by segmenting according to colors or by segmenting according to regions or by segmenting according to detected edges.

6. The method of solution 3, wherein the refinement comprises identifying superpixels in the one or more areas.

7. The method of solution 1, wherein the activating the treatment mechanism includes emitting a fluid projectile towards the target.

8. The method of solution 1, wherein the activating the treatment mechanism includes orienting the treatment mechanism towards the target and emitting a beam of light towards the target.

9. The method of solution 1, wherein the selectively activating comprises activating the treatment mechanism in response to detecting the target in multiple consecutive frames of the one or more images.

10. The method of solution 1, wherein the ML algorithm is implemented using a convolutional neural network (CNN).

Some embodiments may preferably use the following technical solutions.

1. A method performed by a treatment system (e.g., method 153000 depicted in FIG. 45C) having one or more processors, a storage, and a treatment mechanism, comprising: receiving (153200), by the treatment system, sensor inputs including one or more images comprising one or more agricultural objects; continuously performing (153400) a pose estimation of the treatment system based on sensor inputs that are time synchronized and fused; identifying (153600) the one or more agricultural objects as real-world target objects by analyzing the one or more images; tracking (153800) the one or more agricultural objects identified by the analyzing; controlling (153802) an orientation of the treatment mechanism according to the pose estimation for targeting the one or more agricultural objects; and activating (153804) the treatment mechanism to treat the one or more agricultural objects according to the orientation.

2. The method of solution 1, wherein the performing the pose estimation includes determining a global pose estimation using inputs from sensors configured to receive sensor readings of a world environment causing determination of a position and an orientation of a vehicle on which the treatment system is disposed.

3. The method of solution 2, wherein the determining the global pose estimation includes mapping a global scene of the treatment system.

4. The method of solution 2, wherein the sensors configured to receive sensor readings from the world environment include sensors that detect an (x, y, z) orientation.

5. The method of solution 1, wherein the performing the pose estimation comprises determining a local pose estimation from sensors configured to sense a local environment of the treatment system and to determine a localization and an orientation of components of the treatment system.

6. The method of solution 5, further comprising generating a mapping of local scenes based on the information received from the sensors configured to sense the local environment and a global map indicative of a global environment of the treatment system.

7. The method of solution 1, wherein the activating the treatment mechanism includes emitting a beam of light towards the one or more agricultural objects.

8. The method of solution 1, wherein the activating the treatment mechanism includes emitting a fluid projectile towards the one or more agricultural objects.

Additional embodiments and features of the above described solution sets are described throughout the present document.

In some embodiments, a vegetation control system includes a computer system comprising one or more processors, an image capture system configured to capture images of an environment near the computer system and provide the images to the computer system; and a treatment mechanism configured to eject a treatment substance under a control of the computer system. The computer system is mountable on an agricultural vehicle and is configured to implement a machine learning (ML) algorithm, and determine a pose of the computer system, wherein the pose comprises a global pose and a local pose; determine a target in one or more images from the image capture system; and activate, based on the pose and upon the determining the target, the treatment mechanism to emit the treatment substance towards the target. In some embodiments, the image capture system is further configured to capture feedback images of the target during operation of the treatment mechanism, and wherein the computer system is configured to analyze the feedback images to provide a correction signal to the operation of the treatment mechanism. In some embodiments, the computer system analyzes the feedback images using computer vision. In some embodiments, the correction signal comprises adjusting an ejection angle of the treatment mechanism. In some embodiments, the correction signal causes the treatment mechanism to perform a cleaning or an alignment or a modification to an operational parameter. In some embodiments, the image capture system is configured to capture the images of the environment at a first frame rate and a first frame resolution and capture the feedback images at a second frame rate and a second frame resolution, wherein the second frame rate is greater than the first frame rate or the second frame resolution is less than the first frame resolution.

In some embodiments, a system of vegetation control includes one or more offsite computing resources representing offsite computing resources of the system, an onsite platform representing onsite computing resources of the system, wherein the onsite platform is configured to communicate with the offsite computing resources through a communication coupling.

In some embodiments of the system, the onsite platform is configured to obtain a pose of the onsite platform, wherein the pose includes a global pose and a local pose; determine a target in one or more images input to the onsite platform, wherein the one or more images are obtained based on an environment near the onsite platform; and activate, based on the pose and upon the determining the target, a treatment mechanism onboard an agricultural vehicle to emit towards the target. In some embodiments, the onsite platform is further configured to receive a machine learning (ML) model used to determine the target and/or transmit a result of activating the treatment mechanism. In some embodiments, the treatment mechanism is configured to eject a chemical or a laser beam towards the target. In some embodiments, the system further includes an edge server positioned between the offsite computing resources and the onsite computing resources, wherein the edge server is configured for: (1) facilitating a communication from the offsite computing resources to the onsite platform, (2) facilitating a communication from the onsite platform to the offsite computing resources, or (3) offloading computing tasks from the onsite platform in coordination with the onsite platform. In some embodiments, the edge server offloads ML computing tasks from the onsite platform such that the onsite platform is limited to performing computer vision analysis of the result of activating the treatment mechanism. In some embodiments, the ML model is trained at the one or more offsite computing resources based on raw data obtained from the onsite platform.

In some embodiments, a treatment system that is included with an onsite platform may include one or more processors and one or more treatment mechanisms or units that are configured to implement the above-described solutions.

Further features and embodiments of the above-described solutions are disclosed throughout the present document.

From the foregoing, it will be appreciated that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the presently disclosed technology is not limited except as by the appended claims.

Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing unit” or “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

It is intended that the specification, together with the drawings, be considered exemplary only, where exemplary means an example. As used herein, the use of “or” is intended to include “and/or”, unless the context clearly indicates otherwise.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described, and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.
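As one further non-limiting illustration of the implementations described above, the image-differencing approach to detecting an emitted fluid or a resulting spray pattern could be sketched as follows. The use of OpenCV and NumPy, the function names, and the threshold values are assumptions made for illustration only and do not limit what is described.

    # Illustrative sketch only: detect a newly appeared fluid or splat pattern by
    # differencing a frame captured before activation against a frame captured
    # after activation. OpenCV/NumPy and the thresholds are illustrative choices.
    import cv2
    import numpy as np

    def changed_pixel_mask(frame_before, frame_after, threshold=30):
        """Binary mask of pixels whose intensity changed between the two frames."""
        gray_before = cv2.cvtColor(frame_before, cv2.COLOR_BGR2GRAY)
        gray_after = cv2.cvtColor(frame_after, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(gray_after, gray_before)
        _, mask = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)
        return mask

    def treatment_detected(frame_before, frame_after, min_changed_pixels=500):
        """Report a detected treatment when enough pixels changed between frames."""
        mask = changed_pixel_mask(frame_before, frame_after)
        return int(np.count_nonzero(mask)) >= min_changed_pixels

Because the platform is moving, the two frames would typically be aligned first, for example using a homography estimated from the sensor readings, before the differencing step, consistent with the variations described in this document.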

What is claimed is:
1. A method performed by a treatment system on a moving platform, the treatment system having one or more processors, a storage, and a treatment mechanism, comprising: receiving, by the treatment system, during operation in an agricultural environment, one or more sensor readings comprising one or more agricultural objects in the agricultural environment; identifying one or more objects of interest from the one or more agricultural objects by analyzing the one or more sensor readings; determining a first target object of the one or more objects of interest for treatment; determining a predicted position of the first target object in the agricultural environment relative to the treatment mechanism, wherein the predicted position of the first target object is determined by tracking the first target object as the treatment mechanism traverses the agricultural environment; and activating the treatment mechanism to treat the first target object.
2. The method of claim 1, wherein the analyzing the one or more sensor readings comprises analyzing the one or more sensor readings using a machine learning (ML) algorithm that uses one or more ML models configured to identify the one or more agricultural objects.
3. The method of claim 2, wherein the ML algorithm is generated at least in part based on transfer learning from prior ML models or prior versions of the one or more ML models configured to detect plant objects, or landmarks, or both, of a different agricultural environment than the agricultural environment.
4. The method of claim 1, further comprising orienting the treatment mechanism to point at the predicted position as the treatment system tracks the first target object.
5. The method of claim 1, further comprising determining a pose estimation of the treatment system by determining a global pose using inputs from sensors rigidly connected to the moving platform configured to receive sensor readings of a world environment causing determination of a location and an orientation of the moving platform, determining a local pose from sensors configured to sense a local environment of the treatment system causing determination of a localization and an orientation of the treatment system, or a combination thereof.
6. The method of claim 1, wherein tracking the first target object is performed for a period of time subsequent to the determining the first target object for treatment.
7. The method of claim 1, wherein the one or more sensor readings are one or more images captured by one or more image capture devices.
8. The method of claim 7, wherein the one or more objects of interest, including the first target object, are identified in a first image of the one or more images and the first target object is tracked via multiple subsequent images of the one or more images.
9. The method of claim 1, wherein the activating the treatment mechanism includes emitting a laser or emitting a fluid projectile or fluid spray or configuring an end effector to physically interact with the first target object.
10. The method of claim 1, further comprising detecting a treatment of the first target object.
11. The method of claim 10, wherein detecting the treatment of the first target object includes detecting a fluid, the fluid comprising a fluid projectile or a fluid spray, emitted from the treatment mechanism towards the first target object and determining whether the fluid will intercept the first target object, determining whether the fluid has intercepted the first target object, or a combination thereof.
12. The method of claim 11, wherein detecting the fluid comprises performing image differencing of multiple images.
13. The method of claim 11, wherein detecting the fluid comprises performing homography estimation based on the one or more sensor readings captured on the moving platform.
14. The method of claim 10, wherein detecting the treatment of the first target object includes detecting a pattern, the pattern comprising a spot pattern, a splat pattern, or a splash pattern, on or near a portion of ground of the agricultural environment, and determining whether the pattern is identified to be on or near the first target object, determining whether the first target object is identified to be on or near the pattern, or a combination thereof.
15. The method of claim 14, wherein detecting the pattern comprises performing image differencing of multiple images.
16. The method of claim 14, wherein detecting the pattern comprises performing homography estimation based on the one or more sensor readings captured on the moving platform.
17. The method of claim 10, further comprising logging the detected treatment and using a result of the logged detected treatment to generate performance metrics of the treatment system, generate predicted characteristics of the first target object including a future yield, a future size, a future health, a future disease, or a combination thereof, or a combination thereof.
18. The method of claim 1, further comprising logging one or more results of the identification of each of the one or more objects of interest.
19. A treatment system on a moving platform, the treatment system comprising: one or more processors, a storage configured to store data or processor-executable code, and a treatment mechanism under control of the one or more processors, wherein the one or more processors are configured to: receive, during operation in an agricultural environment, one or more sensor readings comprising one or more agricultural objects in the agricultural environment; identify one or more objects of interest from the one or more agricultural objects by analyzing the one or more sensor readings; determine a first target object of the one or more objects of interest for treatment; determine a predicted position of the first target object in the agricultural environment relative to the treatment mechanism, wherein the predicted position of the first target object is determined by tracking the first target object as the treatment mechanism traverses the agricultural environment; and activate the treatment mechanism to treat the first target object.
20. A treatment system on a moving platform, the treatment system comprising: one or more processors, a storage configured to store data or processor-executable code, and a treatment mechanism under control of the one or more processors, wherein the one or more processors are configured to: receive, during operation in an agricultural environment, one or more sensor readings comprising one or more agricultural objects in the agricultural environment; identify one or more objects of interest from the one or more agricultural objects by analyzing the one or more sensor readings; determine a first target object of the one or more objects of interest for treatment; determine a predicted position of the first target object in the agricultural environment relative to the treatment mechanism, wherein the predicted position of the first target object is determined by tracking the first target object as the treatment mechanism traverses the agricultural environment; and activate the treatment mechanism to treat the first target object.